Abstract



This thesis is a compendium of research which brings together ideas from the 
fields of Complex Networks and Computational Neuroscience to address two 
questions regarding neural systems: 

1) How the activity of neurons, via synaptic changes, can shape the topology 
of the network they form part of, and 

2) How the resulting network structure, in its turn, might condition aspects of 
brain behaviour. 

Although the emphasis is on neural networks, several theoretical findings which 
are relevant for complex networks in general are presented - such as a method 
for studying network evolution as a stochastic process, or a theory that allows 
for ensembles of correlated networks, and sets of dynamical elements thereon, 
to be treated mathematically and computationally in a model-independent 
manner. Some of the results are used to explain experimental data - certain
properties of brain tissue, the spontaneous emergence of correlations in
all kinds of networks... - and predictions regarding statistical aspects of the
central nervous system are made. The mechanism of Cluster Reverberation is
proposed to account for the near-instant storage of novel information the brain
is capable of.
Preamble: The Ant, the Grasshopper and Complexity



Once upon a time, in a charming and peaceful little valley, a grasshopper 
sat under the shade of a sunflower, idly strumming up a tune, when a young 
worker ant came into view. The grasshopper watched as she trundled her way 
laboriously up an incline under the weight of a large piece of leaf. When she 
was close enough, he hailed her: 

'Ahoy there, friend. I hope I won't seem tactless if I point out what a
singularly cumbersome bit of leaf you have there. Would you not rather put it
down for a while and join me for a quick jam session? You could bang along 
on some twigs or something.' 

'Thank you for the offer, but I must continue on my way,' replied the ant, 
glancing up in slight surprise at being thus addressed. 

'Oh, what a pity,' the grasshopper rejoined. 'And where, if I may be so bold as 
to inquire, would you be taking your rather unappetising ration of cellulose?' 
'Well, I can't say I really know... I just follow this trail of pheromones I've 
come across. I'm sure it's for some noble purpose though.' 
'Ah, that must be reassuring. And I suppose when you get to wherever it is 
you don't know you're going you intend to eat your bit of leaf...' 
'Oh no, I can't digest something like this - who do you take me for?'
'You can't? Well, how strange...' 
'What's strange?' 

'However did an animal evolve which, instead of engaging in biologically rea- 
sonable (not to mention enjoyable) activities, such as playing music to attract 
sexual partners, prefers to lug useless bits of leaf about? How on earth can 
that serve to spread your genes?' 

'I'm not interested in music or sex, whatever those are. I just follow simple 
rules, like all my identical sisters. You could say we're automata.' 
'Thanks, I was going to but wasn't sure whether you'd be offended. Well, let
me wish you an agreeable day of toil, you frigid little automaton.'

With that, the grasshopper gave a big leap into the air, slightly exasper- 
ated by the folly so often displayed by his fellow insects. Looking down, he 
spotted a few more ants, all carrying leaves in the same direction as the one 
he had just met. Intrigued, he fluttered slightly higher (since grasshoppers 
can, actually, fly, if not all that well). He realised the ants were all heading
for a nest some way off. In fact, there were many ant trails leading to various 
sources of food. It dawned on the grasshopper that although the individual 
ants were just boring little morons idiotically following rules, the nest as a 
whole was managing to find the closest leaves, bring them back along optimal 
routes, and feed them to its plantations of fungi. The colony was behaving 
like an intelligent organism, in some respects not so different from himself,
who functioned thanks to the cells of his body - each with the same genome,
like the ants - cooperating through obedience to relatively simple rules.

This thought impressed the grasshopper very much, driving him to flutter 
even higher so as to see things in greater perspective. From there he considered
the apparently fragile web of trophic, parasitical and symbiotic interactions 
linking all the living beings in the valley - a network which nonetheless must 
have evolved a particularly robust structure not to shatter at the first environ- 
mental fluctuation. He became so enthralled by the idea of such complexity 
on one scale emerging from simplicity on another that he didn't even pay any 
attention to an attractive young grasshopperess making her wanton way just 
below him. Instead, he couldn't help fearing that a butterfly he noticed gently 
flapping his wings would probably set off a hurricane somewhere. As he flew 
ever higher, he began to see snowflakes glide by, overwhelmingly intricate and 
beautiful patterns self-organised out of the simplest little water molecules.
Finally he was so high that he began to reflect on how the very stellar systems,
galaxies, clusters, superclusters, filaments of galaxies... - of which his whole 
world was but an infinitesimal component - also interacted with each other 
via the simple rules of gravity and pressure to form objects marvellous beyond 
conception. 

What he didn't notice until it was too late, as he left behind the cosy 
protection of the atmosphere, was how ultraviolet sunlight and ionising cosmic
rays were steadily burning his wings each to a crisp. Beginning to fall, he
only hoped he would have time to consider the several morals to his tragic tale.

After a while spent plummeting to his doom he realised that, the freefall
terminal velocity and life expectancy of a grasshopper being what they respec- 
tively were, he would most likely die peacefully of old age somewhere along 
his way down - never again contemplating his Edenic valley except, like some 
prophetic locust, from afar. 
over 100 runs and global probabilities as in Eq. (3.6). Theoretical
estimates correspond to Eqs. (3.12), (3.14) and (3.11) applied to the
networks generated by the same simulations. The last column lists the
respective configuration model values: C and l are obtained theoretically
as in (Newman, 2003d), while r, from MC simulations as in (Maslov et al.),
is the value expected due to the absence of multiple edges. (See also
Fig. 3.4.)

C.1 Food webs appearing in Fig. C.3 (listed from least to most
assortative): r is the assortativity and p the nestedness. The origins of
all data are cited in Ref. (Dunne et al., 2003); the data were kindly
provided to us by Jennifer Dunne.

C.2 Empirical networks appearing in Fig. C.3 (listed from least to
most assortative): r is the assortativity and u the nestedness.
All data available on the personal Web pages of Alex Arenas,
Mark Newman and Duncan Watts.
Chapter 1

Summary



A paradigm of a complex system and the least understood of our organs, the
brain is, essentially, an immense network of cells which communicate with each
other via electrochemical signals. This work gathers and develops ideas from
the young field of Complex Networks in an attempt to improve our understanding
of the complex collective behaviour that can emerge in networks of neurons
from relatively simple individual dynamics.

Chapter 2 is a brief introduction to Complex Networks and to Computational
Neuroscience. Among other things, it describes the Hopfield model of an
attractor neural network, in which each node represents a neuron and the
synapses are represented by the edges. This system can store information, in
the form of patterns (particular configurations of active and inactive
neurons), in the synaptic weights; that is, in the intensity with which the
activity of one neuron influences its neighbours. If, to represent a given
pattern, two neighbouring neurons must adopt the same state (active or
inactive), the interaction between them is strengthened, while it is weakened
otherwise. Repeating this operation for every connected pair of neurons and
for every pattern, the patterns become the states that minimise the total
energy (attractors of the dynamics), and the system always evolves towards the
pattern that most resembles its initial state. This mechanism, known as
associative memory, is responsible for the storage and retrieval of
information both in more realistic models of neural media and in many
artificial devices that perform tasks such as the identification and
classification of images. Moreover, there is nowadays sufficient experimental
evidence to assert that something similar occurs in the brain: via the
biochemical processes of long-term potentiation (LTP) and long-term depression
(LTD), synapses gradually modify their conductances during learning.
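
The storage and retrieval mechanism just described can be illustrated with a minimal sketch. It assumes the simplest textbook setting, which is not necessarily the one used in later chapters: a fully connected network, neuron states coded as +1 (active) and -1 (inactive), and noiseless dynamics. The function names are illustrative only.

```python
import random

def train_hebbian(patterns):
    """Build symmetric synaptic weights from the stored patterns (Hebbian rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    # Equal states strengthen the interaction, opposite states weaken it.
                    w[i][j] += xi[i] * xi[j] / n
    return w

def recall(w, state, sweeps=10):
    """Noiseless dynamics: each neuron in turn aligns with the field from the others."""
    s = list(state)
    n = len(s)
    for _ in range(sweeps):
        for i in range(n):
            field = sum(w[i][j] * s[j] for j in range(n))
            s[i] = 1 if field >= 0 else -1
    return s

def overlap(a, b):
    """Similarity between two configurations: 1 if identical, -1 if opposite."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
n = 100
patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(3)]
w = train_hebbian(patterns)

# Corrupt the first pattern by flipping 15 neurons, then let the network relax.
noisy = list(patterns[0])
for i in rng.sample(range(n), 15):
    noisy[i] = -noisy[i]
final = recall(w, noisy)
print(overlap(noisy, patterns[0]), overlap(final, patterns[0]))
```

With only three patterns stored in a hundred neurons the load is far below the capacity of the model, so the relaxed state should coincide, or very nearly so, with the stored pattern closest to the initial condition.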

Chapter 3 addresses the problem of how a network with the kind of structure
observed in the brain can develop. To this end, a network that evolves via
probabilistic changes depending in any way on local and global information
about the degrees (numbers of neighbours) of the nodes is formalised as a
stochastic process, as is done in Ref. (Johnson et al., 2010a). These
assumptions are considered relevant for the brain because synaptic growth and
atrophy depend on the electrical activity of the neuron in question, which in
turn may be related to its number of neighbours and to the mean synaptic
density in the network. It is shown that this situation is described by a
Fokker-Planck equation, which is then applied to two sets of real
neurophysiological data. On the one hand, the synaptic pruning curves (the
sharp reduction in synaptic density that the cortex undergoes during infancy)
obtained from human autopsies can be explained with minimal assumptions. On
the other, several statistical magnitudes of the neural network of the
nematode C. elegans (degree distribution, correlation profile, clustering and
mean minimum path) emerge naturally, and with some precision, right at the
phase transition that the model exhibits. This lends weight to the idea that
the nervous system optimises its performance by placing itself close to a
critical point. A similar case, presented in Ref. (Johnson et al., 2009b), in
which the edges of the network are stochastically rewired rather than
disappearing or appearing, is described in an appendix.

The rest of the thesis focuses on the effects that the topological features
described in the preceding chapters can have on the collective behaviour of
systems of neurons. It is known that the heterogeneity of the degree
distribution of a network tends to have a significant influence on the
dynamics of elements connected via its edges. In the case of Hopfield neural
networks, Torres et al. (Torres et al., 2004) showed that in scale-free
networks (which are highly heterogeneous) performance increases with
heterogeneity. Chapter 4 examines the same effect in a neural network that
includes another biological ingredient: synaptic depression, thanks to which a
transition is observed from a phase of static memory to one in which the
system jumps chaotically among the stored patterns. It turns out that near
this critical point (the famous Edge of Chaos) the network is able to perform
a dynamical task necessary for living beings: to recognise, from among several
stored patterns, a given one that is "shown" to it, and to retain it
indefinitely thereafter. As we showed in Ref. (Johnson et al., 2008), the
heterogeneity of the degree distribution brings the critical point to a region
of parameter space in which there is hardly any synaptic depression. Bearing
in mind that this depression worsens the memory capacity of the system, it
follows that a highly heterogeneous network is optimal for performing this
kind of dynamical task. Functional networks measured in the human cortex
during tasks of this sort take the form of the most heterogeneous scale-free
network possible, which suggests the hypothesis that the brain may be
maximising its performance in this way.

Another much-studied property of complex networks is the existence of
correlations between the degrees of neighbouring nodes. When these
correlations are positive (highly connected nodes tend to be connected to
others that also have many neighbours, and those with few to others like them)
the network is said to be assortative; it is disassortative if the
correlations are negative (nodes with many connections are connected
predominantly to those with few). Curiously, it had been observed that social
networks (for example, networks of professional collaborations or of sexual
contacts) are in general assortative, while practically all others (genetic,
trophic, protein, transport and word networks, the Internet, the Web...) are
significantly disassortative. Although the effects of these correlations had
been studied in several systems, the mathematical and computational techniques
used suffered from drawbacks that limited the generality of the results. To
overcome this, Chapter 5 describes a new method for partitioning the phase
space of networks into regions of equal correlations, a technique that allows
for both theoretical and computational analysis of this kind of system. Using
this method together with ideas from Information Theory, the main result of
Ref. (Johnson et al., 2010b) is also proven: that disassortativity is the
"natural" state (in the sense of an equilibrium situation) of heterogeneous
networks, which explains the preponderance of such configurations in reality.
The human preference for grouping according to similar properties would
explain why social networks are found out of equilibrium, in assortative
regions of phase space.

In Chapter 6 the method of Chapter 5 is applied to the case of a Hopfield
neural network that exhibits not only heterogeneity but also node-node
correlations. It is found, as already described in Ref. (de Franciscis et al.,
2011), that such systems can notably increase their robustness to noise thanks
to positive correlations. Again, this seems to fit, at least qualitatively,
with experimental results that have found functional networks in the human
cortex to be highly assortative.



We have said that neural networks can learn thanks to an appropriate
modification of the synaptic weights via LTP and LTD, which accounts for
long-term memory. But these biochemical processes take place on a
characteristic timescale of at least minutes. Models of short-term memory,
which operates on shorter timescales, usually take for granted that the
information being used is already stored in the brain, and that the subject
performing the task need only activate and somehow maintain the correct
configuration (as in Chapter 4). But it is easy to see that this cannot be the
case for every task: it suffices to look at something entirely new for an
instant, close one's eyes, and think about what has been seen. The only
existing models of short-term memory that do not require synaptic learning
rely on each neuron somehow maintaining the information that corresponds to it
(typically thanks to a series of sub-cellular processes). But since memory
then emerges not as a collective property of the system but as a sum of
individual memories, these models suffer from a serious lack of robustness to
noise. And, far from exhibiting reliable individual behaviour, neurons are
characterised precisely by being highly variable cells, with a tendency to
fire more or less at random. In Chapter 7 a mechanism is proposed, called
Cluster Reverberation (CR), thanks to which even systems such as networks of
simple binary units, as in the Hopfield model, can store information instantly
without the need for synaptic learning, and in a way that can be made as
robust to noise as desired (Johnson et al., 2011). To do so the system takes
advantage of the existence of metastable states (situations that minimise the
energy of the system locally, without corresponding to the global minimum). As
a consequence, transients appear in the dynamics of neural activity whose
properties follow directly from characteristics of the underlying topology of
the kind described in Chapter 3 and found in experiments, namely the degree of
clustering or the modularity of the network. Basically, small densely
interconnected groups of neurons can maintain a joint state of high or low
activity, on average. By considering each small group, rather than each
neuron, as an elementary functional unit, the required properties are
obtained. Moreover, some other characteristics of short-term memory emerge
naturally from this mechanism. In particular, it is shown that information is
lost gradually over time according to an approximately power-law decay, as has
been described in psychophysical experiments.

In conclusion, the main original contributions of this Thesis are:

Analytical and computational methods for studying evolving networks
(Johnson et al., 2009b, 2010a) and networks with node-node correlations
(Johnson et al., 2010b; de Franciscis et al., 2011).

An answer to the question of why most real networks are disassortative
(Johnson et al., 2010b).

Topological properties that can optimise the performance of neural
networks (Johnson et al., 2008; de Franciscis et al., 2011).

A mechanism that might underlie short-term memory (Johnson et al., 2011).



Chapter 2 

Where we are and where we'd 
like to go 

2.1 From bridges to brains 

Strolling through the streets of Königsberg, a young Immanuel Kant may have
wondered whether, as some hoped, a path could be found that would take him
once and only once over each of the city's celebrated seven bridges and back to
where he started. In 1736 Leonhard Euler pointed out that for this or any other
problem of the kind all that mattered was which land masses were connected
to each other, and by how many bridges. In other words, the situation could
be captured by a graph, as in Fig. 2.1, in which each land mass is represented
by a node (also called vertex) and each bridge by a link (or edge). He showed
that in the case of Königsberg no such walk could be found, since an "Eulerian
cycle" in a connected graph exists if and only if the degrees of all nodes are
even numbers - the degree of a node being its number of edges (Euler, 1736).
And thus was Graph Theory born.
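
Euler's criterion is simple enough to be checked directly. The sketch below encodes the seven bridges as a list of node pairs, where the labels A to D are arbitrary names chosen here for the four land masses (A being the central island), and tests whether the graph is connected with all degrees even.

```python
from collections import Counter

# Each pair is one bridge; repeated pairs are parallel bridges.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def degrees(edges):
    """Number of edge ends attached to each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def is_connected(edges):
    """Depth-first search from an arbitrary node must reach every node."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for nb in adj[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(adj)

def has_eulerian_cycle(edges):
    """Euler's condition: connected graph with every degree even."""
    return is_connected(edges) and all(k % 2 == 0 for k in degrees(edges).values())

print(degrees(bridges))             # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(has_eulerian_cycle(bridges))  # False
```

All four land masses have odd degree, so the sought-after walk does not exist; a triangle, by contrast, passes the test.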

Figure 2.1: The problem of the Seven Bridges of Königsberg can be reduced
to a graph in which nodes and edges represent land masses and bridges,
respectively.

For over two centuries, the graphs people were interested in were precisely
defined objects, usually sufficiently small to be drawn on a piece of paper. But
in the late nineteen fifties, mathematicians began to study random graphs -
i.e., defined only by some random generation process - perhaps with a view to
better dealing with ever-growing communications networks (Bollobás, 2001).
E. N. Gilbert considered a situation in which there are n nodes and each pair
is connected by an edge with probability q (Gilbert, 1959). For different values
of these parameters, he was able to obtain the likelihood of the graph being
connected (that is, of there being a path joining any two nodes). A similar
model was proposed by Paul Erdős and Alfréd Rényi: each of all the possible
graphs with n nodes and m edges had an equal chance of being "picked"
(Erdős and Rényi, 1959). In fact, all graphs with a given number of edges are
equally likely in either scenario, so for large n the two descriptions are
equivalent, and usually known as the Erdős-Rényi (ER) model. It is simple to
see that if one were to average over many graphs generated by either of these
processes, the degrees would follow a binomial distribution - tending, for
large n, to a Poisson distribution. That is, the distribution $p(k)$ of the
degrees $k_i$ (where $k_i$ is the degree of node i) is peaked around its mean
value and drops off at least exponentially fast. An interesting phenomenon that
can be observed using the ER model is that of percolation. If we measure the
size S of the largest connected component (that is, the highest number of
nodes in the graph forming a connected subgraph) at different values of the
probability q (or, equivalently, of $m = \frac{1}{2} q n^2$), we see that there
is a critical value, $q_c = 1/n$, above which $S/n$ does not vanish for high n -
that is, there will usually exist a connected subgraph of a size comparable to
that of the whole system. This passing from one situation (or phase) to
another, each characterised by some qualitatively different property, is
known in physics as a phase transition. In this case, it is a second-order
transition, since the order parameter $S/n$ varies continuously (and not
abruptly, as in first-order transitions). Percolation has innumerable
applications. For instance, the nodes might be people susceptible to some
disease, trees which may be set on fire, or oil bubbles in a porous medium. The
epidemic will spread, the forest will burn, or the oil will be extractable if
the density of edges - contagious contact, fire-conducting proximity, or links
between pores - is over the critical value.
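
The percolation transition can be observed numerically with very little machinery. The following sketch generates graphs by Gilbert's prescription and measures the relative size of the largest connected component below, at, and above the critical probability 1/n. The function names, the network size and the random seed are arbitrary choices made for this illustration.

```python
import random

def er_graph(n, q, rng):
    """Gilbert's construction: each of the n(n-1)/2 pairs is linked with probability q."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < q:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def largest_component(adj):
    """Size S of the largest connected component, found by repeated graph search."""
    seen = [False] * len(adj)
    best = 0
    for start in range(len(adj)):
        if seen[start]:
            continue
        seen[start] = True
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if not seen[nb]:
                    seen[nb] = True
                    stack.append(nb)
        best = max(best, size)
    return best

rng = random.Random(1)
n = 1000
fractions = {}
for c in (0.5, 1.0, 2.0):          # q = c/n, so c = 1 is the critical point
    adj = er_graph(n, c / n, rng)
    fractions[c] = largest_component(adj) / n
    print(c, fractions[c])
```

Below the threshold the largest component occupies a negligible fraction of the nodes, while for q = 2/n it should span most of the graph.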



In his 1929 short story Chains (Láncszemek, in the original Hungarian)
Frigyes Karinthy suggested that the number of people reachable through a chain
of acquaintances grows exponentially with the length of the chain, and thus
that very few steps are needed to join anyone with any other person. This
Small World idea was taken up in 1967 by Stanley Milgram, who performed a
series of experiments that, while somewhat less controversial than his
well-known obedience-to-authority explorations, have nonetheless been widely
discussed (Milgram, 1967). He and his colleagues sent various letters to random
people with the request to attempt to send them on to a particular individual
many thousands of miles away, plus the constraint that this had to be done via
people with whom the sender was on first-name terms. Many of the letters
reached their destination, after having been sent on by a surprisingly small
number of intervening people. This was later popularised as the Six Degrees of
Separation - the famous idea that any two people are linked by a path of only
six acquaintances. Within the connected component of an ER random graph any
two nodes are also joined by a short path - of the order of the logarithm of
the number of nodes. However, this is less surprising, since these networks are
not clustered; that is, they do not have the property typical of social
networks whereby "the friends of my friends are (also likely to be) my
friends." In 1998, Duncan Watts and Steven Strogatz put forward a network model
which took this feature into account. They considered a ring of n nodes, each
connected to its k nearest neighbours (they set k = 4). Each edge was then
broken from one of its nodes and rewired to some other random node with a
probability p. Thus, p = 0 leaves the ring intact, while p = 1 changes it into
an ER random graph. Two magnitudes were measured for different values of p:
the mean-minimum-path length, l, and the clustering coefficient, C. The first
is simply an average over all pairs of nodes of the lengths of the minimum
paths connecting them. The latter can be seen as the probability that two
neighbours of a given node are directly connected to each other. For p = 0,
the clustering is high ($C = 1/2$) and independent of the network size, while
the mean minimum path scales with n ($l \simeq n/8$). At the other extreme,
p = 1 yields a vanishingly small clustering ($C \simeq k/n$) but short paths
($l \simeq \ln n / \ln k$). The most interesting case is found at intermediate
values of p. As p grows from zero, l falls very rapidly to a value close to the
random case, but C does not present this drop until a much higher value is
reached. Watts and Strogatz called this intermediate zone the Small-World
region, since everyone is highly interconnected while much of the local
structure is conserved. They suggested that this is actually a property of many
real networks (as has since turned out to be the case (Newman, 2003d)), most
especially of social networks - in which C is often several orders of magnitude
greater than if the graph were random, while l is not much larger than in such
a case. As the authors point out, it is essential to take this feature into
account for the study of, say, epidemics.
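
The behaviour of C and l under rewiring can be reproduced with a short simulation. The sketch below is a simplified variant of the procedure described above, in which a rewired edge always keeps its lower-numbered end and never creates self-loops or repeated edges; these details, like the values n = 200 and p = 0.1, are assumptions made for the example. The mean path is taken over reachable pairs only, in case rewiring disconnects the graph.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even)."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, p, rng):
    """With probability p, detach each edge from one end and reattach it at random."""
    n = len(adj)
    edges = [(i, j) for i in range(n) for j in adj[i] if i < j]
    for i, j in edges:
        if rng.random() < p:
            options = [t for t in range(n) if t != i and t not in adj[i]]
            if options:
                t = rng.choice(options)
                adj[i].remove(j)
                adj[j].remove(i)
                adj[i].add(t)
                adj[t].add(i)
    return adj

def clustering(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    total, counted = 0.0, 0
    for nbrs in adj:
        nb = list(nbrs)
        pairs = len(nb) * (len(nb) - 1) // 2
        if pairs == 0:
            continue
        linked = sum(1 for a in range(len(nb)) for b in range(a + 1, len(nb))
                     if nb[b] in adj[nb[a]])
        total += linked / pairs
        counted += 1
    return total / counted

def mean_path(adj):
    """Mean shortest-path length over reachable pairs (breadth-first search)."""
    total, pairs = 0, 0
    for s in range(len(adj)):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(2)
n, k = 200, 4
ring = ring_lattice(n, k)
c0, l0 = clustering(ring), mean_path(ring)
small = rewire(ring_lattice(n, k), 0.1, rng)
c1, l1 = clustering(small), mean_path(small)
print(c0, l0)   # intact ring: C = 1/2 and l close to n/8
print(c1, l1)   # p = 0.1: l has dropped sharply while C remains high
```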

Another feature of networks which is quite ubiquitous in the real world is
that degree distributions are highly heterogeneous; in fact, they often follow
power-laws, $p(k) \sim k^{-\gamma}$, with $\gamma$ a positive constant typically
between 2 and 3. Such networks are nowadays referred to as scale free. In the
nineteen fifties, Herbert Simon showed that these distributions come about when
"the rich get richer" (Simon, 1955). Applying this idea to the case of
scientific citations, Derek de Solla Price proposed the first known model of a
scale-free network, in which nodes represent papers and edges are citations
(de S. Price, 1965). Each node has an in-degree (the number of papers citing
it) and an out-degree (papers it cites). That is, the network is directed,
since edges have a direction. Assuming that the probability a paper has of
being cited by a new one is proportional to the number already citing it (its
in-degree), the network is built up through the gradual addition of nodes, the
neighbours of these being chosen according to their existing in-degrees. Price
showed analytically that such a mechanism leads to an in-degree distribution
$p(k) \sim k^{-(2+1/m)}$, with m a parameter of the model equivalent to the
mean degree^1. He called this mechanism cumulative advantage. Somewhat
ironically - considering that Price, with a PhD in history from Cambridge, is
best known as the father of scientometrics - this work was mostly ignored by
the scientific community. The model was rediscovered in 1999 by Albert-László
Barabási and Réka Albert, with the difference that they considered the network
to be undirected (Barabási and Albert, 1999). They coined the term preferential
attachment for the rich-get-richer mechanism, which is now generally assumed to
be behind the formation of most scale-free networks (although other mechanisms
exist (Caldarelli et al., 2002; Krapivsky et al., 2000; Newman, 2005)). Among
many interesting consequences of such degree heterogeneity, Mark Newman
showed that the clustering and mean-minimum-path length are respectively
higher and lower than in homogeneous networks, making all scale-free networks
to some extent small worlds (Newman, 2003b). It also has important
consequences for dynamical processes taking place among elements on the
network, such as the synchronisation of coupled oscillators (Barahona and
Pecora, 2002).

^1 Note that in a directed network, the mean in-degree and mean out-degree coincide.
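
The rich-get-richer mechanism is easily simulated. The sketch below follows the undirected version (each new node attaches m edges to existing nodes chosen with probability proportional to their degrees) rather than Price's directed model; the complete-graph seed, the sizes and the function name are assumptions of the example.

```python
import random
from collections import Counter

def preferential_attachment(n, m, rng):
    """Grow an undirected network in which new nodes prefer well-connected targets."""
    # Seed: a complete graph on m + 1 nodes, so every node starts with degree m.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every edge end is listed once, so a uniform draw from 'ends' selects
    # a node with probability proportional to its current degree.
    ends = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in targets:
            edges.append((new, t))
            ends.extend((new, t))
    return edges

rng = random.Random(3)
n, m = 5000, 2
edges = preferential_attachment(n, m, rng)
degree = Counter(v for e in edges for v in e)
mean_k = 2 * len(edges) / n
max_k = max(degree.values())
print(mean_k, max_k)
```

The mean degree is close to 2m, yet the best-connected nodes should have degrees many times larger, which is the signature of the heterogeneity discussed above; in an ER graph of the same density such hubs would be extremely unlikely.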



As mentioned above, networks can be made up of separate components
such that no path exists between nodes in different subgraphs. This is an
extreme case of community structure. However, what is usually more
interesting is the fact that communities may exist such that there is a higher
density of edges within them than without, even if the network is connected
(Girvan and Newman, 2002). These communities are also at times called modules
or clusters (although this can create some confusion with the related but
distinct idea of clustering referred to above). Given a network, one can make
a partition - that is, divide the nodes up into groups - and calculate what
proportion of the edges fall within these, compared with the random
expectation. This measure is called the modularity of this partition, and
sometimes one speaks of the modularity of a graph referring to that of the
partition for which this value is maximum. Determining the community structure
of empirical networks can often provide useful insights into aspects of the
systems. For instance, the communities may correspond to functional groups
in a metabolic network, or groups of people who share some trait. However,
there are many problems related to making this kind of measurement.
For one thing, there are so many possible partitions that even an ER random
graph can have a fairly high modularity due simply to statistical fluctuations
(Reichardt and Bornholdt, 2006). Then there is the fact that community
structure can exist on many different levels - that is, the groups considered
can be of any size - so one must usually consider a hierarchy of modules
(Arenas et al., 2006). Furthermore, finding an optimal partition is an
NP-Complete problem (Garey and Johnson, 1979), which makes comparing the
modularity of each possible partition intractable in all but very small
networks. For these and other reasons, in recent years much work has gone into
finding efficient algorithms to determine the community structure of networks,
albeit approximately (Girvan and Newman, 2002; Donetti and Muñoz, 2004;
Blondel et al., 2008) - as well as into comparing the results offered by each
approach (Danon et al., 2005).
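
As an illustration of how the modularity of a partition may be computed, the sketch below assumes the commonly used definition in which, for each group, the fraction of edges falling inside it is compared with the square of the fraction of edge ends attached to it (the expectation if edges were placed at random with degrees preserved). The text above does not commit to a particular formula, so this choice is an assumption, as are the toy graph and the names used.

```python
def modularity(edges, group):
    """Sum over groups of (fraction of edges inside) - (fraction of edge ends)^2."""
    m = len(edges)
    inside = {}   # number of edges with both ends in the group
    ends = {}     # total degree of the nodes in the group
    for u, v in edges:
        ends[group[u]] = ends.get(group[u], 0) + 1
        ends[group[v]] = ends.get(group[v], 0) + 1
        if group[u] == group[v]:
            inside[group[u]] = inside.get(group[u], 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in ends.items())

# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "all" for v in range(6)}

print(modularity(edges, two_triangles))  # 5/14, about 0.357
print(modularity(edges, one_group))      # 0.0
```

Splitting the graph along the single bridging edge gives a clearly positive value, whereas lumping all nodes together gives zero, since the observed and expected proportions then coincide.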

Finally, another feature of networks worth mentioning arises when the 
nodes of a network are endowed with some property and this is re flected by 
the layout o f the edges: the situation is called a mixing pattern (INewmanl . 



2002, 



2003al ). For instance, people might tend to choose sexual partners who 
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share certain characteristics, such as mother-tongue or seh-defined race. In 
these cases the network is assortative, since nodes of a kind assort, or group 
together. However, if the property in question were, say, gender, then the same 
graphs would be disassortative if most of the relations were heterosexual. In 
these cases the property can be considered discrete, but it can be continuous 
- for instance, people might assort according to age. An interesting case is 
when the property in question is the degree of each node, since it is then an 
entirely topological issue. The extent to which the degrees of neighbouring 
nodes are correlated - as given, for example, by Pearson's correlation
coefficient (Newman, 2002) - is then a measure of the assortativity of the graph,
being positive for assortative networks and negative for disassortative ones. 
It turns out that there is a striking universality in the nature of the
degree-degree correlations displayed by real-world networks, whether natural or
artificial: social networks, like the ones just described, tend to be assortative,
while almost all other kinds of network are disassortative (Pastor-Satorras et
al., 2001; Newman, 2003). Often specific functional constraints can be found to
justify correlations of one or other kind, but in Chapter 5 of this thesis the
purely topological explanation put forward in Ref. (Johnson et al., 2010b) is
described. In any case, the degree-degree correlations of a network can play a
significant role in the behaviour of processes taking place thereon. For
example, assortative networks have lower percolation thresholds and are more
robust to targeted attack (Newman, 2003a), while disassortative ones make for
more stable ecosystems and seem to be more synchronizable (Brede and Sinha).
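
The degree-degree correlation described above can be sketched in a few lines of Python. The example below is only a minimal illustration, assuming an undirected graph given as a list of edges; the function name `degree_assortativity` and the two toy graphs are hypothetical and not taken from the works cited.

```python
import numpy as np

def degree_assortativity(edges):
    """Pearson correlation between the degrees found at the two ends of each edge."""
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    # Count every undirected edge in both directions so the measure is symmetric
    x = np.array([degree[i] for i, j in edges] + [degree[j] for i, j in edges], dtype=float)
    y = np.array([degree[j] for i, j in edges] + [degree[i] for i, j in edges], dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

# A star: the hub (degree 4) is only ever linked to leaves (degree 1)
r_star = degree_assortativity([(0, 1), (0, 2), (0, 3), (0, 4)])

# A triangle plus a separate single edge: every edge joins nodes of equal degree
r_like = degree_assortativity([(0, 1), (1, 2), (0, 2), (3, 4)])
```

The star is perfectly disassortative (coefficient $-1$) and the second graph perfectly assortative (coefficient $+1$). Note that the coefficient is undefined for a graph in which all nodes have the same degree, since the variance of the degrees at the ends of edges is then zero.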



All the aspects of networks mentioned in this brief overview, as well as many
others, have been shown to be relevant for a wide range of complex systems
(Albert and Barabasi, 2002; Newman, 2003c; Boccaletti et al., 2006). Among
these is the brain, a paradigm of complexity as well as the least understood
of our organs. However, research focusing on how the collective behaviour
of neural systems, as observed in mathematical models, is influenced by the
topology of the underlying network is relatively scarce. This is perhaps due in
part to the attention that other biological properties of the nervous system have
tended to draw from the Computational Neuroscience community. Thanks to
the flurry of activity that the field of Complex Networks has been enjoying
over the last decade, this is a particularly good moment to undertake a more
systematic analysis of how dynamics and topology are related in this kind of
system.



2.2 Neural networks in neuroscience

Ever since the publication of Santiago Ramon y Cajal's drawings of neurons -
in his words, those "mysterious butterflies of the soul" - it has been clear that
the nervous system is composed of a large number of such cells connected to
one another to form a network (y Cajal, 1995). Long axons, ending in
terminals which form synapses to the dendrites which branch out from neighbouring
neurons, transmit action potentials (APs) - changes in the cellular membrane
voltage - and enable neurons somehow to cooperate and give rise to the
astonishing emergent phenomenon called thought. One of these APs is formed each
time the membrane electric potential of a neuron surpasses a threshold value,
leading to the opening of a great many voltage-dependent ionic gates between
the cell and the extra-cellular medium. In turn, the membrane potential of a
given neuron is constantly affected by action potentials arriving from
neighbouring neurons, and thus an extremely complex web of cellular signalling is
achieved.



The first model neuron was proposed by McCulloch and Pitts (1943). This
was simply an element that would return a Heaviside step function of the sum
of its inputs. Sets of such "artificial neurons" could be used to implement
any logical gate. Shortly after this, another important suggestion was made,
this time by the psychologist Donald Hebb. Attempting to relate Pavlovian
conditioning experiments with cellular plasticity, he conjectured, in 1949, the
existence of some biological mechanism that would lead to neurons which
repeatedly fired (i.e., let off action potentials) together becoming more strongly
coupled (Hebb, 1949). The initiation and propagation of action potentials
in individual neurons was first modelled mathematically by Alan L. Hodgkin
and Andrew Huxley in 1952 by means of a set of nonlinear ordinary differential
equations which took into account the various ion fluxes (Hodgkin and Huxley,
1952).
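
The threshold element of McCulloch and Pitts mentioned above is simple enough to write down directly. The Python sketch below is only an illustration: the particular weights and thresholds used to build the gates are assumptions chosen for this example, and the function names are hypothetical.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def and_gate(a, b):
    # Fires only when both inputs are active
    return mcculloch_pitts([a, b], [1, 1], threshold=2)

def or_gate(a, b):
    # Fires when at least one input is active
    return mcculloch_pitts([a, b], [1, 1], threshold=1)

def not_gate(a):
    # An inhibitory (negative) weight inverts the input
    return mcculloch_pitts([a], [-1], threshold=0)

truth_table = [(a, b, and_gate(a, b), or_gate(a, b)) for a in (0, 1) for b in (0, 1)]
```

Since AND, OR and NOT can each be realised by one such unit, suitable combinations of units can in principle implement any logical gate.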

However, the concept of a neural network (as understood in theoretical
and computational neuroscience) was partly inspired by mathematical models
of spin systems. The first of these was the Ising model, put forward in 1920
by Wilhelm Lenz and studied by Ernst Ising with a view to understanding phase
transitions and magnets (Onsager, 1944; Brush, 1967). It
was known that the spin of electrons conferred a magnetic moment to
individual atoms, but it wasn't clear how exactly very many such spins could
self-organise into a large body producing a net magnetic field. By considering
an infinite set of entities (spins) with possible values plus or minus one (up
or down, say) which, when placed at the nodes of a lattice, interact in such a
way that energy is lowest when neighbours are aligned, and a temperature
parameter to govern the extent of random fluctuations, it was eventually shown
that, below a certain critical temperature (in two or more dimensions),
symmetry is spontaneously broken and most of the spins end up aligned (Baxter,
1982). This ferromagnetic solution comes about and is then maintained
because it has a lower energy than any other configuration of spins. Subsequent
models, in particular that of Sherrington and Kirkpatrick (1975), incorporated
inhomogeneities in the coupling strengths such that there was no longer a
configuration which simultaneously minimized all interaction energies, leading to
frustrated states (spin-glasses).

Figure 2.2: Drawing of the cells of the chick cerebellum, from "Estructura de
los centros nerviosos de las aves", Madrid, 1905. Notice how the neurons make
up a complex network of synaptic interactions.

These ideas were put together, by Amari (1972) and then by Hopfield
(1982), in the first neural network models to exhibit the mechanism known as
associative memory. Each model neuron was placed at the node of a network, 
originally assumed to be fully connected (all nodes connected to all the rest), 
and followed a dynamics which can be seen either as that of Ising spins or of 
McCulloch-Pitts neurons. However, a noise parameter usually referred to as 
"temperature" by analogy with spin systems could be included to allow for 
non-deterministic behaviour. By setting the interaction strengths (synaptic 
weights) not randomly, as in the Sherrington-Kirkpatrick model, but according
to the Hebb rule referred to above, information could be stored and retrieved
by the system. More specifically, a set of particular patterns, or configurations 
of positive and negative elements (firing and non- firing neurons), are recorded 
in the following way: for each pattern, one looks at each pair of neurons and 
adds a quantity to the weight of the synapse joining them if the pattern in 
question requires them to be in the same state, and subtracts it when they 
should be opposite. In this way, the minimum energy configurations correspond 
to the stored patterns, which therefore become attractors of the dynamics: if 
the temperature is not so high as to destroy all order, the system will evolve
towards whichever of these patterns most resembles the initial configuration 
it is placed in. Figure 2.3 illustrates how this mechanism works for a system
such that the firing and non-firing neurons represent black and white pixels of 
a bitmap image. 
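
The storage and retrieval procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions rather than the exact model studied in this thesis: patterns are random vectors of $\pm 1$, the dynamics is deterministic (zero temperature) with neurons updated one at a time, and the names `hebbian_weights` and `recall`, together with all sizes and the random seed, are hypothetical choices.

```python
import numpy as np

def hebbian_weights(patterns):
    """Hebb rule: each pattern raises w_ij if it has s_i = s_j and lowers it otherwise."""
    patterns = np.asarray(patterns, dtype=float)
    n_neurons = patterns.shape[1]
    weights = patterns.T @ patterns / n_neurons
    np.fill_diagonal(weights, 0.0)               # no self-connections
    return weights

def recall(weights, state, sweeps=10, rng=None):
    """Zero-temperature asynchronous dynamics: align each neuron with its local field."""
    rng = np.random.default_rng() if rng is None else rng
    state = np.array(state, dtype=float)
    for _ in range(sweeps):
        for i in rng.permutation(len(state)):
            field = weights[i] @ state
            state[i] = 1.0 if field >= 0 else -1.0
    return state

rng = np.random.default_rng(0)
patterns = rng.choice([-1.0, 1.0], size=(3, 100))   # three random patterns, 100 neurons
weights = hebbian_weights(patterns)

noisy = patterns[0].copy()
flipped = rng.choice(100, size=15, replace=False)    # corrupt 15 of the 100 "pixels"
noisy[flipped] *= -1

initial_overlap = float(noisy @ patterns[0]) / 100
retrieved = recall(weights, noisy, rng=rng)
final_overlap = float(retrieved @ patterns[0]) / 100  # 1.0 would mean perfect retrieval
```

With so few patterns stored in comparison with the number of neurons, the corrupted configuration is expected to flow back to the stored pattern it most resembles, so that the overlap rises from 0.7 to a value close to one.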

Thanks to associative memory, if we were to store, say, a set of photos of 
various people and then "show" the network a different picture of one of the 
same subjects, it would be able to retrieve the correct identity. Not only is this 
mechanism used nowadays in technology capable of performing tasks such as 
pattern discrimination and classification, but it is widely considered to underlie 
our own capacity for learning and recalling information. There is evidence
from neuronal readouts that this is so (Amit, 1995), and not long ago, in vivo
experiments finally established that learning is indeed related to the processes
of long term potentiation (LTP) and depression (LTD) - by which synapses
between neurons that fire nearly simultaneously gradually increase or decrease
their conductance depending on the interval of time elapsed (Gruart et al.,
2006; Roo et al., 2008).



The neural network models studied nowadays generally include more
realistic dynamics both for the neurons and for the synapses, taking into account
a variety of cellular and subcellular processes (Amit, 1989; Torres and Varona,
2010). For example, the fact that the conductance of synapses in reality
depends on their workload has been found to enable a network to switch from
one pattern to another - either spontaneously or as a reaction to sensory
stimuli - providing a means for the performance of dynamic tasks (Cortes et al.,
2006; Holcman and Tsodyks, 2006); this result also seems to agree well with
physiological data (Korn and Faure, 2003). In fact, there is evidence that the
brain somehow maintains itself close to a boundary - called, in physics, a
critical point - between an ordered and a chaotic regime (Chialvo, 2004;
Chialvo et al., 2008; Bonachela et al., 2010; Eguiluz et al., 2005; Torres and
Varona, 2010). This would be in line with research that shows how certain useful
properties - such as the computational capacity of some neural-network models
(Bertschinger and Natschlager, 2004), or the dynamic range of sensitivity to
stimuli in sensory systems (Kinouchi and Copelli, 2006) - are optimised at this
"edge of chaos" (Chialvo, 2005).

Figure 2.3: In the Hopfield neural-network model, the interaction strengths
(representing synaptic weights) store information in the form of particular
patterns, or configurations of firing neurons, which become attractors of the
dynamics. Whatever the initial state of the system, it will always evolve
towards one of these patterns, thus allowing for the storage and retrieval of
information. The mechanism, known as associative memory, is thought to be
at the basis of memory in the brain. In this case, the network is
"remembering" an illustration by Jean-Baptiste Oudry for Jean de la Fontaine's fable La
Cigale et la Fourmi.



That these models should reflect, albeit in an enormously simplified
way, what actually goes on in our brains tends to fit in quite well with
intuitive expectations - to the extent that so-called connectionist models seem
to be gradually becoming the accepted paradigm in relevant areas of psychology
and philosophy (Marcus, 2001; Frank, 1997). For instance, from
this point of view the way in which the recollection of a particular detail often
evokes, almost instantly, a whole landscape of sensations and emotions makes 
sense, since these concepts will have been stored in some way as the same pat- 
tern. Also, the fact that new memories are recorded in synapses which were 
already being used to store previous information would seem to explain why 
memories tend to fade slowly with time, yet can still be recalled, at least to 
some extent, when a particular thought in some sense overlaps with (reminds 
one of) one of them. When this happens, the old memory springs to mind and, 
if held there for long enough, can be refreshed via long term potentiation and 
depression - although interaction with other patterns or with current stimuli 
may well modify the refreshed information. Similarly, previous information 
influences the storing of new memories, leading to the well known fact that we 
tend to "see" things the way we expect them to be. 



It seems, then, that the basic mechanisms behind the ability of our brains 
to remember things, at least when the information is stored slowly enough for 
the biochemical processes of LTP and LTD to be at work (long-term memory), 
are now understood. Not only are the implications of such knowledge
far-reaching in themselves; the way in which it was developed is also particularly
noteworthy. More or less sketchy ideas from areas as diverse as behavioural
psychology, neurophysiology and theoretical physics were brought together in 
order to come up with a minimal mathematical model capable of manifesting 
the sought-after phenomenon of information retrieval as a consequence of the 
known properties of a great many simple elements. This kind of research can 
at first seem more like a mathematical game than anything to do with
nitty-gritty reality. But the fact that the basic mechanism of associative memory has
since borne up to decades of experimenting and theoretical probing reveals
how insightful it can actually be. It is likely that other features of brain
function - short-term memory, information processing or emotional tagging, 
to name but the first few that spring to mind - will eventually be thrown under 
a similar light. In fact, we can expect the nature of even such an elusive and 
intimate phenomenon as consciousness some day to become clear. After all, 
the explanations behind other emergent properties of matter which in their day 
seemed almost mystical, such as temperature or life, are now well understood.

2.3 A declaration of intent

As Zora Neale Hurston put it, "Research is formalized curiosity. It is poking
and prying with a purpose." But there are many possible purposes, and even 
more different ways of poking and prying. The motivation behind the work 
presented here is to understand how the phenomena we observe in certain 
systems on a macroscopic scale can come about from interactions of their 
many relatively simple constituent elements. In the case of neural systems, it 
seems reasonable to assume that these basic elements are neurons, and that 
it is thanks to the cooperation of a great many of these cells that such organs 
are able to think and feel. The human brain - with about 100 billion neurons 
connected by 100 trillion synapses - being the most complex system we know 
of, an enormous degree of simplification will be required for our description to 
be of any use to this purpose. (In fact, if we could somehow simulate a brain 
in all detail, the result would be just as unfathomable as the original object, 
however exciting the activity may prove for other reasons.) The physiology of 
the neuron is nowadays quite well understood. However, just as the properties 
of atoms or transistors that are key to understanding phase transitions or the 
workings of a microchip are, respectively, magnetic interactions and voltage- 
dependent gating, we must try to ascertain exactly which neuronal features are 
necessary for the macroscopic behaviour we are interested in to occur. One 
way to do this is to start by considering only the most basic characteristics 
and explore what non-trivial phenomena emerge from these, allowing us then 
to add new ingredients one at a time to pinpoint the relevant ones. In this 
line, we consider large sets of Hopfield's simple binary model neurons to study 
how network properties are related to collective behaviour. 

This work is laid out as follows. Chapter 3 deals with development. The
appearance and disappearance of edges in a network (growth and death of
synapses, in the case of the brain) is formalised as a stochastic process and
studied in a general setting (Johnson et al., 2009b, 2010a). It turns out that
many of the topological features observed in experiments are well modelled in
this way - which to some extent justifies, a posteriori, our initial assumptions.
The following chapters describe particular phenomena that emerge as a
direct consequence of some of those topological features: degree heterogeneity in
conjunction with synaptic depression improves the performance of dynamical
tasks (Johnson et al., 2008) (Chapter 4); assortativity serves to enhance a
neural network's robustness to noise (de Franciscis et al., 2011) (Chapter 6); and
clustering or modularity can lead to metastable states with certain properties
essential for some short-term memory abilities (properties hitherto lacking in
previous models) (Johnson et al., 2011) (Chapter 7). Thanks to the extreme
simplicity of the basic elements we are considering, we are able not only to 
simulate but also to understand mathematically how exactly the interesting 
phenomena emerge. This makes it possible to predict, to some extent, which 
extra ingredients will not invalidate the results if they are taken into account 
explicitly. 

Some of the work has a more general scope than the study of neural
networks. In particular, the equations obtained in Chapter 3 can be applied to any
network that evolves under the influence of probabilistic addition and deletion
of edges. And the method put forward in Chapter 5 for the study of correlated
networks can be used not just for analysing particular models, as we go on
to do in Chapter 6, but to solve many other problems - such as that of the
ubiquity of disassortative networks in nature and technology (Johnson et al.,
2010b), or how the property of nestedness typical of ecosystems is related to
other topological characteristics (c.f. Appendix C).

To sum up, the aim of the thesis is to shed light on how cellular dy- 
namics can lead to the complex network structures of neural systems, 
and, in its turn, in what ways this topology can influence, optimise 
and determine the collective behaviour of such systems. 

The main contributions made are: 



• An analytical method to study the evolution of networks governed by a 
combination of local and global stochastic rules. 

• A mathematical and computational technique for the study of correlated 
networks in a model-independent way. 

• Possible biological justifications for two non-trivial features of the
topology of the human cortex: heterogeneity of the degree distribution and
high assortativity.

• An answer to the long-standing question of why most networks are dis- 
assortative. 

• Cluster Reverberation: the first mechanism proposed which would allow
neural systems to store information instantaneously in a robust manner. 



Chapter 3 



Evolving networks and the 
development of neural systems 

The highly heterogeneous degree distributions of most empirical networks are
assumed in many cases to arise from some form of cumulative advantage,
or preferential attachment. However, the origin of various other topological 
features is often not clear and attributed to specific functional requirements. 
We show how it is possible to analyse a very general scenario in which nodes 
gain or lose edges according to arbitrary functions of local and/or global degree 
information. Applying our method to two rather different examples of brain 
development - synaptic pruning in humans and the neural network of the worm 
C. elegans - we find that simple biologically motivated assumptions lead to
very good agreement with experimental data. In particular, many nontrivial 
topological features of the worm's brain arise naturally at a critical point. 



3.1 Introduction 



The conceptual simplicity of a network - a set of nodes, some pairs of which
are connected by edges - often suffices to capture the essence of cooperation in
complex systems. Ever since Barabasi and Albert presented their evolving
network model (Barabasi and Albert, 1999), in which linear preferential
attachment leads asymptotically to a scale-free degree distribution (the degree,
k, of a node being its number of neighbouring nodes), there have been many
variations or refinements to the original scenario (Krapivsky et al., 2000;
Albert and Barabasi, 2000; G. Bianconi and Barabasi, 2001; Bianconi and
Barabasi, 2001; Park et al., 2005; Ree, 2007) (see also the review by
Boccaletti et al. (2006)). In Ref. (Johnson et al., 2009b), we show how
topological phase transitions and scale-free solutions can emerge in the case of
nonlinear rewiring in fixed-size networks, and this work is summarized in
Appendix A. In Ref. (Johnson et al., 2010a) we extend our scope to more general
and realistic situations, considering the evolution of networks making only
minimal assumptions about the attachment/detachment rules. In fact, all we
assume is that these probabilities factorize into two parts: a local term that
depends on node degree, and a global term, which is a function of the mean
degree of the network. This is the work described in this chapter.



Our motivation can be found in the mechanisms behind many real-world
networks, but we focus, for the sake of illustration, on the development of
biological neural networks, where nodes represent neurons and edges play the part
of synaptic interaction (Amit, 1989; Sporns et al., 2004; Torres and Varona,
2010). Experimental neuroscience has shown that enhanced electric activity
induces synaptic growth and dendritic arborization (Lee et al., 1980; Frank,
1997; Klintsova and Greenough, 1999; Roo et al., 2008). Since the activity
of a neuron depends on the net current received from its neighbours, which
tends to be higher the more neighbours it has, we can see node degree as a
proxy for this activity - accounting for the local term alluded to above. On
the other hand, synaptic growth and death also depend on concentrations of
various molecules, which can diffuse through large areas of tissue and
therefore cannot in general be considered local. A feature of brain development
in many animals is synaptic pruning - the large reduction in synaptic
density undergone throughout infancy. Chechik et al. (1999, in press) have shown
that via an elimination of less needed synapses, this can reduce the energy
consumed by the brain (which in a human at rest can account for a quarter
of total energy used) while maintaining near optimal memory performance.
Going on this, we will take the mean degree of the network - or mean synaptic
density - to reflect total energy consumption, hence the global terms in our
attachment/detachment rules (Johnson et al., 2009a).



An alternative approach would be to consider some kind of model
neurons explicitly and couple the probabilities of synaptic growth and death to
neuronal dynamic variables, such as local and global fields. In an
Amari-Hopfield network, for example, the expected value of the field (total incoming
current) at node i is proportional to i's degree (Torres et al., 2004), the
total current (energy consumption) in the network therefore being proportional
to the mean degree; qualitatively, these observations are likely to hold also
in more realistic situations (Magistretti, 2009), although relations need not
be linear. Co-evolving networks of this sort are currently attracting
attention, with dynamics such as Prisoner's Dilemma (Poncela et al., 2008), Voter
Model (Vazquez et al., 2008) or Random Walkers (Antiqueira et al., 2009).
Although we consider this line of work particularly interesting, for generality
and analytical tractability we opt here to use only topological information for
the attachment/detachment rules; our results can nonetheless be applied to any
situation in which the dynamical states of the elements at the nodes can be
functionally related to degree^1.

Following a brief general analysis, we show how appropriate choices of func- 
tions induce the system to evolve towards heterogeneous (sometimes scale-free) 
networks while undergoing synaptic pruning in quantitative agreement with 
experiments. At the same time, degree-degree correlations emerge naturally, 
thus making the resulting networks disassortative - as tends to be the case for 
most biological networks - and leading to realistic small-world parameters. 



3.2 Basic considerations 

Consider a simple undirected network with $N$ nodes defined by the adjacency
matrix $a$, the element $a_{ij}$ representing the existence or otherwise of an edge
between nodes $i$ and $j$. Each node can be characterized by its degree,
$k_i = \sum_j a_{ij}$. Initially, the degrees follow some distribution $p(k, t = 0)$;
at time $t$ the distribution $p(k, t)$ has mean $\kappa(t)$. We wish to study the
evolution of networks in which nodes can gain or
lose edges according to stochastic rules which only take into account local and
global information on degrees. So as to implement this in the most general
way, we will assume that at every time step, each node has a probability of
gaining a new edge, $P_i^{\mathrm{gain}}$, to a random node; and a probability of losing a
randomly chosen edge, $P_i^{\mathrm{lose}}$. We assume these factorize as
$P_i^{\mathrm{gain}} = u(\kappa)\pi(k_i)$ and $P_i^{\mathrm{lose}} = d(\kappa)\sigma(k_i)$,
where $u$, $d$, $\pi$ and $\sigma$ can be arbitrary functions, but
impose nothing else other than normalization.
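
The stochastic rules just described can be simulated directly. The Python sketch below is only an illustration under several assumptions of our own: the graph is stored as a list of neighbour sets, the local factors are taken to be equal to one for every node, the global factors are simple linear functions of the mean degree with hypothetical parameters `n` and `kappa_max`, and attempts that would create a self-loop or a double edge are simply discarded. The function name `evolve_step` is likewise hypothetical.

```python
import numpy as np

def evolve_step(neighbours, u, d, pi, sigma, rng):
    """One time step of stochastic edge gain and loss on an undirected simple graph.

    neighbours : list of sets, neighbours[i] holding the neighbours of node i
    u, d       : global factors, functions of the mean degree kappa
    pi, sigma  : local factors, functions of the array of node degrees
    """
    n_nodes = len(neighbours)
    degrees = np.array([len(s) for s in neighbours])
    kappa = degrees.mean()
    gains = rng.random(n_nodes) < u(kappa) * pi(degrees)
    losses = rng.random(n_nodes) < d(kappa) * sigma(degrees)
    for i in map(int, np.flatnonzero(gains)):
        j = int(rng.integers(n_nodes))           # partner chosen uniformly at random
        if j != i and j not in neighbours[i]:    # skip self-loops and double edges
            neighbours[i].add(j)
            neighbours[j].add(i)
    for i in map(int, np.flatnonzero(losses)):
        if neighbours[i]:                        # isolated nodes have nothing to lose
            j = int(rng.choice(sorted(neighbours[i])))
            neighbours[i].remove(j)
            neighbours[j].remove(i)

# Illustrative (assumed) choices: linear global factors, uniform local factors
N, n, kappa_max = 200, 20, 10.0
u = lambda kappa: (n / N) * (1 - kappa / kappa_max)
d = lambda kappa: (n / N) * kappa / kappa_max
uniform = lambda k: np.ones(len(k))

rng = np.random.default_rng(1)
neighbours = [set() for _ in range(N)]           # start from an empty network
for _ in range(500):
    evolve_step(neighbours, u, d, uniform, uniform, rng)
mean_degree = float(np.mean([len(s) for s in neighbours]))
```

With these choices the gain and loss probabilities balance when the mean degree is about half of `kappa_max`, so the simulated mean degree should settle near that value, up to fluctuations and the small bias introduced by discarded attempts.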

For each edge that is withdrawn from the network, two nodes decrease in
degree: $i$, chosen according to $\sigma(k_i)$, and $j$, a random neighbour of $i$'s; so
there is an added effective probability of loss $k_j/(\kappa N)$. Similarly, for each edge
placed in the network, not only $l$, chosen according to $\pi(k_l)$, increases its degree;
a random node $m$ will also gain, with the consequent effective probability
$N^{-1}$ (though see footnote 2). Let us introduce the notation
$\tilde{\pi}(k) \equiv \pi(k) + N^{-1}$ and $\tilde{\sigma}(k) \equiv \sigma(k) + k/(\kappa N)$.
Network evolution can now be seen as a one step
process (van Kampen, 1992) with transition rates $u(\kappa)\tilde{\pi}(k)$ and $d(\kappa)\tilde{\sigma}(k)$. The
expected value for the increment in a given $p(k, t)$ at each time step (which we
equate with a temporal derivative) defines a master equation for the degree
distribution (Johnson et al., 2009b):

$$
\frac{dp(k,t)}{dt} = u(\kappa)\,\tilde{\pi}(k-1)\,p(k-1,t) + d(\kappa)\,\tilde{\sigma}(k+1)\,p(k+1,t)
- \left[u(\kappa)\,\tilde{\pi}(k) + d(\kappa)\,\tilde{\sigma}(k)\right] p(k,t). \qquad (3.1)
$$

^1 For instance, the stationary distribution of walkers used for edge dynamics by
Antiqueira et al. (2009) is actually obtained purely from topological information, although
it can only be written in terms of local degrees for undirected networks.



Assuming now that $p(k,t)$ evolves towards a stationary distribution, $p_{\mathrm{st}}(k)$,
then this must necessarily satisfy detailed balance since it is a one step process
(van Kampen, 1992): i.e., the flux of probability from $k$ to $k+1$ must equal the
flux from $k+1$ to $k$, for all $k$ (Marro and Dickman). This condition (sufficient
for Eq. (3.1) to be zero) can be written as

$$
\frac{\partial p_{\mathrm{st}}(k)}{\partial k} = \left[\frac{u(\kappa_{\mathrm{st}})\,\tilde{\pi}(k)}{d(\kappa_{\mathrm{st}})\,\tilde{\sigma}(k+1)} - 1\right] p_{\mathrm{st}}(k), \qquad (3.2)
$$



where we have substituted a difference for a partial derivative and
$\kappa_{\mathrm{st}} \equiv \sum_k k\,p_{\mathrm{st}}(k)$. Setting $\pi$ and $\sigma$ so as to be normalized to one (i.e.,
$\sum_k p(k)\pi(k) = \sum_k p(k)\sigma(k) = N^{-1}$), which is equivalent to saying that at each time step
exactly $u(\kappa)$ nodes are chosen to gain edges and $d(\kappa)$ to lose them, then in the
stationary state we will have $u(\kappa_{\mathrm{st}}) = d(\kappa_{\mathrm{st}})$ since the total number of edges
will be conserved. From Eq. (3.2) we can see that $p_{\mathrm{st}}(k)$ will have an extremum
at some value $k_e$ if it satisfies $\tilde{\pi}(k_e) = \tilde{\sigma}(k_e + 1)$. $k_e$ will be a maximum
(minimum) if the numerator in Eq. (3.2) is smaller (greater) than the denominator
for $k > k_e$, and vice versa for $k < k_e$. Assuming, for example, that there is
one and only one such $k_e$, then a maximum implies a relatively homogeneous
distribution, while a minimum means $p_{\mathrm{st}}(k)$ will be split in two, and therefore
highly heterogeneous. More intuitively, if for nodes with large enough $k$ there
is a higher probability of gaining edges than of losing them, the degrees of
these nodes will grow indefinitely, leading to heterogeneity. If, on the other
hand, highly connected nodes always lose more edges than they gain, we will
obtain quite homogeneous networks. From this reasoning we can see that a
particularly interesting case (which turns out to be critical) is that in which
$\tilde{\pi}(k)$ and $\tilde{\sigma}(k)$ are such that

$$
\tilde{\pi}(k) = \tilde{\sigma}(k) \equiv v(k), \quad \forall k. \qquad (3.3)
$$

According to Eq. (3.2), Condition (3.3) means that for large $k$,
$\partial p_{\mathrm{st}}(k)/\partial k \to 0$, and $p_{\mathrm{st}}(k)$ flattens out - as for example a power-law does.

^2 We are ignoring the small corrections that arise because $j \neq i$ and $l \neq m$, which in any
case would disappear if self-connections were allowed.

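The standard Fokker-Planck approximation for the one step process defined
by Eq. (3.1) is (van Kampen, 1992)

$$
\frac{\partial p(k,t)}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial k^2}\left\{\left[d(\kappa)\tilde{\sigma}(k) + u(\kappa)\tilde{\pi}(k)\right]p(k,t)\right\}
+ \frac{\partial}{\partial k}\left\{\left[d(\kappa)\tilde{\sigma}(k) - u(\kappa)\tilde{\pi}(k)\right]p(k,t)\right\}. \qquad (3.4)
$$

For transition rates which meet Condition (3.3), Eq. (3.4) can be written as:

$$
\frac{\partial p(k,t)}{\partial t} = \frac{1}{2}\left[d(\kappa) + u(\kappa)\right]\frac{\partial^2}{\partial k^2}\left[v(k)p(k,t)\right]
+ \left[d(\kappa) - u(\kappa)\right]\frac{\partial}{\partial k}\left[v(k)p(k,t)\right]. \qquad (3.5)
$$

Ignoring boundary conditions, the stationary solution must satisfy, on the one
hand, $v(k)p_{\mathrm{st}}(k) = Ak + B$, so that the diffusion is stationary, and, on the
other, $u(\kappa_{\mathrm{st}}) = d(\kappa_{\mathrm{st}})$, to cancel out the drift. For this situation to be
reachable from any initial condition, $u(\kappa)$ and $d(\kappa)$ must be monotonic functions,
decreasing and increasing respectively.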
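
The detailed-balance argument used in this section can be checked numerically for any one step process. The Python sketch below is a generic illustration, not the calculation performed in this chapter: it takes two hypothetical rate functions, `gain(k)` for $k \to k+1$ and `loss(k)` for $k \to k-1$ (playing the roles of the full transition rates), assumes they are strictly positive where needed, and builds the stationary distribution on a truncated range of degrees.

```python
import numpy as np

def stationary_distribution(gain, loss, k_max):
    """Stationary p(k), k = 0..k_max, of a one step process via detailed balance.

    gain(k) is the rate of k -> k+1 and loss(k) the rate of k -> k-1.
    """
    log_p = np.zeros(k_max + 1)
    for k in range(k_max):
        # Detailed balance: gain(k) p(k) = loss(k+1) p(k+1)
        log_p[k + 1] = log_p[k] + np.log(gain(k)) - np.log(loss(k + 1))
    p = np.exp(log_p - log_p.max())              # subtract the maximum for stability
    return p / p.sum()

# Assumed example: constant gain rate and loss rate proportional to the degree
p_st = stationary_distribution(gain=lambda k: 4.0, loss=lambda k: float(k), k_max=60)
mean_degree = float(np.arange(61) @ p_st)
```

For this example the stationary distribution is a Poisson distribution of mean 4, which is relatively homogeneous. Its extremum lies where `gain(k_e) = loss(k_e + 1)`, that is at $k_e = 3$, where $p(3) = p(4)$, in line with the extremum condition discussed above.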



3.3 Synaptic pruning 

As a simple example, we will first consider global probabilities which have the
linear forms:

$$
u[\kappa(t)] = \frac{n}{N}\left[1 - \frac{\kappa(t)}{\kappa_{\max}}\right] \quad \text{and} \quad d[\kappa(t)] = \frac{n}{N}\,\frac{\kappa(t)}{\kappa_{\max}}, \qquad (3.6)
$$

where $n$ is the expected value of the number of additions and deletions of edges
per time step, and $\kappa_{\max}$ is the maximum value the mean degree can have. This
choice describes a situation in which the higher the density of synapses, the less
likely new ones are to sprout and the more likely existing ones are to atrophy
- a situation that might arise, for instance, in the presence of a finite quantity
of nutrients. Again taking into account that $\pi$ and $\sigma$ are normalized to one,
summing over $P_i^{\mathrm{gain}} - P_i^{\mathrm{lose}}$ we find that the expected increment in $\kappa(t)$ is

$$
\frac{d\kappa(t)}{dt} = 2\left\{u[\kappa(t)] - d[\kappa(t)]\right\} = \frac{2n}{N}\left[1 - 2\,\frac{\kappa(t)}{\kappa_{\max}}\right]
$$

(independently of the local probabilities). Therefore, the mean degree will
increase or decrease exponentially with time, from $\kappa(0)$ to $\frac{1}{2}\kappa_{\max}$. Assuming
that the initial condition is, say, $\kappa(0) = \kappa_{\max}$, and expressing the solution in
terms of the mean synaptic density - i.e., $\rho(t) \equiv \kappa(t)N/(2V)$, with $V$ the total
volume considered - we have

$$
\rho(t) = \rho_f\left(1 + e^{-t/\tau_p}\right), \qquad (3.7)
$$

where we have defined $\rho_f \equiv \rho(t \to \infty)$ and the time constant for pruning is
$\tau_p = \rho_f N/n$. This equation was fitted in Fig. 3.1 to experimental data on
layers 1 and 2 of the human auditory cortex^3 obtained during autopsies by
Huttenlocher and Dabholkar (1997).
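
The exponential relaxation of the mean degree can be verified with a few lines of Python. The sketch below iterates the expected increment $2\{u[\kappa] - d[\kappa]\}$ once per time step, using the linear forms of $u$ and $d$, and compares the result with the exponential decay from $\kappa_{\max}$ towards $\kappa_{\max}/2$. The parameter values are arbitrary choices for illustration, and the time constant is written here directly in terms of the mean degree, as $N\kappa_{\max}/(4n)$, which is what the linear rates imply for $\kappa(t)$.

```python
import numpy as np

N, n, kappa_max = 1000, 10, 20.0                 # assumed, purely illustrative values

def u(kappa):
    """Global probability of growth, decreasing with the mean degree."""
    return (n / N) * (1 - kappa / kappa_max)

def d(kappa):
    """Global probability of death, increasing with the mean degree."""
    return (n / N) * kappa / kappa_max

steps = 2000
kappa = np.empty(steps + 1)
kappa[0] = kappa_max                             # start from the maximum mean degree
for t in range(steps):
    kappa[t + 1] = kappa[t] + 2 * (u(kappa[t]) - d(kappa[t]))

tau = N * kappa_max / (4 * n)                    # time constant implied by the linear rates
times = np.arange(steps + 1)
closed_form = 0.5 * kappa_max * (1 + np.exp(-times / tau))
max_error = float(np.max(np.abs(kappa - closed_form)))
```

The iterated values stay very close to the closed form; the small residual difference comes only from treating discrete time steps as a continuous derivative.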

It seems reasonable to assume that the initial overgrowth of synapses is due 
to the transient existence of some kind of growth factors. If we account for 
these by including a nonlinear, time-dependent term $g(t) = a\exp(-t/\tau_g)$ in
the probability of growth, i.e., $u[\kappa(t), t] = (n/N)[1 - \kappa(t)/\kappa_{\max} + g(t)]$, leaving
$d[\kappa(t)]$ as before, we find that $\rho(t)$ becomes

$$\rho(t) = \rho_f\left[1 + e^{-t/\tau_p} - \left(1 + e^{-t_0/\tau_p}\right)e^{-(t - t_0)/\tau_g}\right], \qquad (3.8)$$

where $t_0$ is the time at which synapses begin to form ($t = 0$ corresponds to
the moment of conception) and $\tau_g$ is the time constant related to growth. The
inset in Fig. 3.1 shows the best fit to the auditory cortex data. Since the
boundary conditions $\rho_f$ and (for Eq. (3.8)) $t_0$ are simply taken as the value
of the last data point and the time of the first one, in each case, the time
constants $\tau_p$ and $\tau_g$ are the only parameters needed for the fit.
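
To make the two fitting functions concrete, the sketch below evaluates Eq. (3.7)
and Eq. (3.8), the latter read here as
$\rho(t) = \rho_f[1 + e^{-t/\tau_p} - (1 + e^{-t_0/\tau_p})e^{-(t - t_0)/\tau_g}]$. The
parameter values are those quoted for layer 1 in the caption of Fig. 3.1; the
function names are our own, and the code is only meant as an illustration of
the shape of the curves, not as the fitting procedure itself.

```python
import math


def density_pruning(t, rho_f, tau_p):
    """Eq. (3.7): pure pruning from twice the final density."""
    return rho_f * (1.0 + math.exp(-t / tau_p))


def density_growth_pruning(t, rho_f, tau_p, tau_g, t0):
    """Eq. (3.8): transient growth factor plus pruning; vanishes at t = t0."""
    growth = (1.0 + math.exp(-t0 / tau_p)) * math.exp(-(t - t0) / tau_g)
    return rho_f * (1.0 + math.exp(-t / tau_p) - growth)


if __name__ == "__main__":
    # Layer 1 values from the caption of Fig. 3.1 (times in days).
    rho_f, tau_p, tau_g, t0 = 30.7, 5221.0, 151.0, 192.0
    for t in (192, 500, 1000, 5000, 20000):
        print(t, round(density_growth_pruning(t, rho_f, tau_p, tau_g, t0), 2))
```

The density starts from zero at $t_0$, overshoots $\rho_f$ during the first
years and then relaxes towards $\rho_f$, which is the nonmonotonic behaviour the
two time constants are fitted to.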



Footnote: Data points for three particular days (smaller symbols) are omitted from
the fit, since we believe these must be from subjects with inherently lower
synaptic density.

Figure 3.1: Synaptic densities in layers 1 (red squares) and 2 (black circles)
of the human auditory cortex against time from conception (horizontal axis:
conceptual age in days). Data from Huttenlocher and Dabholkar (1997), obtained by
directly counting synapses in tissues from autopsies. Lines follow best fits to
Eq. (3.7), where the parameters were: for layer 1, $\tau_p = 5041$ days; and for
layer 2, $\tau_p = 3898$ days (for $\rho_f$ we have used the last data points: 30.7 and
40.8 synapses/$\mu$m$^3$, for layers 1 and 2 respectively). Data pertaining to the
first year and to days 4700, 5000 and 7300, shown with smaller symbols, were
omitted from the fit. Assuming the existence of transient growth factors, we can
include the data points for the first year in the fit by using Eq. (3.8). This
is done in the inset (where time is displayed logarithmically). The best fits
were: for layer 1, $\tau_g = 151.0$ and $\tau_p = 5221$; and for layer 2, $\tau_g = 191.1$
and $\tau_p = 4184$, all in days (we have approximated $t_0$ to the time of the first
data points, 192 days).



3.4 Phase transitions

The drift-like evolution of the mean degree we have just illustrated with the
example of synaptic pruning is independent of the local probabilities $\pi(k)$
and $\sigma(k)$. The effect of these is rather in the diffusive behaviour which can
lead, as mentioned, either to homogeneous or to heterogeneous final states.
A useful bounded order parameter to characterize these phases is therefore
$m \equiv \exp(-\sigma^2/\kappa^2)$, where $\sigma^2 = \langle k^2\rangle - \kappa^2$ is the variance of the
degree distribution ($\langle\cdot\rangle \equiv N^{-1}\sum_i(\cdot)$ represents an average over nodes).
We will use $m_{st} \equiv \lim_{t\to\infty} m(t)$ to distinguish between the different phases,
since $m_{st} = 1$ for a regular network and $m_{st} \to 0$ for one following a highly
heterogeneous distribution. Although there are particular choices of
probabilities which lead to Eq. (3.5), these are not the only critical cases,
since the transition from homogeneous to heterogeneous stationary states can
come about also with functions which never meet Condition (3.3). Rather, this is
a classic topological phase transition, the nature of which depends on the
choice of functions (Park and Newman, 2004; Burda et al., 2004; Derényi et al.,
2004).

Figure 3.2: Evolution of the degree distributions of networks beginning as
regular random graphs with $\kappa(0) = 20$ in the critical (top) and supercritical
(bottom) regimes. Local probabilities are $\sigma(k) = k/(\langle k\rangle N)$ in both cases, and
$\pi(k) = 2\sigma(k) - N^{-1}$ and $\pi(k) = k^{3/2}/(\langle k^{3/2}\rangle N)$ for the critical and
supercritical ones, respectively. Global probabilities as in Eq. (3.6), with
$n = 10$ and $\kappa_{\max} = 20$. Symbols in the main panels correspond to $p(k, t)$ at
different times as obtained from MC simulations. Lines result from numerical
integration of Eq. (3.1). Insets show typical time series of $\kappa$ and $m$. Light
blue lines are from MC simulations and red lines are theoretical, given by
Eq. (3.7) and by the corresponding theoretical expression, respectively.
$N = 1000$.
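
The order parameter $m = \exp(-\sigma^2/\kappa^2)$ introduced in this section is
straightforward to evaluate from a degree sequence. The short sketch below (the
function name is our own choice) returns 1 for a regular network and a value
very close to 0 for a star, which is an extreme case of heterogeneity.

```python
import numpy as np


def order_parameter(degrees):
    """m = exp(-variance / mean^2) of a degree sequence."""
    k = np.asarray(degrees, dtype=float)
    kappa = k.mean()
    variance = (k ** 2).mean() - kappa ** 2
    return float(np.exp(-variance / kappa ** 2))


if __name__ == "__main__":
    regular = [20] * 1000            # every node has the same degree
    star = [1000] + [1] * 1000       # one hub connected to 1000 leaves
    print(order_parameter(regular), order_parameter(star))
```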



Evolution of the degree distribution is shown in Fig. 3.2 for critical and
supercritical choices for the probabilities, as given by MC simulations (starting 
from regular random graphs) and contrasted with theory. (The subcritical 
regime is not shown since the stationary state has a distribution similar to 
the ones at t = 10^ MCS in the other regimes.) The disparity between the 
theory and the simulations for the final distributions is due to the build up of
certain correlations not taken into account in our analysis. This is because the
existence of some very highly connected nodes reduces the probability of there
being very low degree nodes. In particular, if there are, say, $x$ nodes connected
to the rest of the network, then a natural cutoff, $k_{\min} = x$, emerges. Note
that this occurs only when we restrict ourselves to simple networks, i.e., with
only one edge allowed for each pair of nodes. This topological phase transition
is shown in Fig. 3.3, where $m_{st}$ is plotted against parameter $\alpha$ for global
probabilities as in Eq. (3.6) and local ones $\pi(k) \sim k^{\alpha}$ and $\sigma(k) \sim k$. This
situation corresponds to one in which edges are eliminated randomly while
nodes have a power-law probability of sprouting new ones (note that power
laws are good descriptions of a variety of monotonic response functions, yet
only require one parameter). Although, to our knowledge, there are not yet
enough empirical data to ascertain what degree distribution the structural
topology of the human brain follows, it is worth noting that its functional
topology, at the level of brain areas, has been found to be scale-free with an
exponent very close to 2 (Eguíluz et al., 2005).

Figure 3.3: Phase transitions in $m_{st}$ for $\pi(k) \sim k^{\alpha}$ and $\sigma(k) \sim k$, and
$u(\kappa)$ and $d(\kappa)$ as in Eq. (3.6) (horizontal axis: $\alpha$). $N = 1000$ (blue
squares), 1500 (red triangles) and 2000 (black circles);
$\kappa(0) = \kappa_{\max} = 2n = N/50$. Corresponding lines are from numerical integration
of Eq. (3.1). The bottom left inset shows values of the highest eigenvalue of
the Laplacian matrix (red squares) and of $Q = \lambda_N/\lambda_2$ (black circles), a
measure of unsynchronizability; $N = 1000$. The top right inset shows transitions
for the same parameters in the final values of Pearson's correlation
coefficient $r$ (see Section 3.5), both for only one edge allowed per pair of
nodes (red squares) and without this restriction (black circles).

In general, most other measures can be expected to undergo a transition
along with the variance of the degree distribution. For instance, highly
heterogeneous networks (such as scale-free ones) exhibit the small-world
property, characterized by a high clustering coefficient, $C \gg \langle k\rangle/N$, and a
low mean minimum path, $l \sim \ln(N)$ (Watts and Strogatz, 1998). A particularly
interesting topological feature of a network is its synchronizability - i.e.,
given a set of oscillators placed at the nodes and coupled via the edges, how
wide a range of coupling strengths will result in them all becoming
synchronized. Barahona and Pecora showed analytically that, for linear
oscillators, a network is more synchronizable the lower the relation
$Q = \lambda_N/\lambda_2$, where $\lambda_N$ and $\lambda_2$ are the highest and lowest non-zero
eigenvalues of the Laplacian matrix ($\Lambda_{ij} = \delta_{ij}k_i - a_{ij}$), respectively
(Barahona and Pecora, 2002). The bottom left inset in Fig. 3.3 displays the
values of $Q$ and $\lambda_N$ obtained for the different stationary states. There is a
peak in $Q$ at the critical point. It has been argued that this tendency of
heterogeneous topologies to be particularly unsynchronizable poses a paradox
given the wide prevalence of scale-free networks in nature, a problem that has
been deftly got around by considering appropriate weighting schemes for the
edges (Motter et al., 2005; Chavez et al., 2005) (see also the footnote below,
and the review by Arenas et al. (2008a)). However, there is no generic reason
why high synchronizability should always be desirable. In fact, it has recently
been shown that heterogeneity can improve the dynamical performance of model
neural networks precisely because the fixed points are easily destabilised
(Johnson et al., 2008) (as well as conferring robustness to thermal fluctuations
and improving storage capacity (Torres et al., 2004)). This makes intuitive
sense, since, presumably, one would not usually want all the neurons in one's
brain to be doing exactly the same thing. Therefore, this point of maximum
unsynchronizability at the phase transition may be a particularly advantageous
one.
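
For readers who wish to reproduce the synchronizability measure, the following
sketch computes $Q = \lambda_N/\lambda_2$ from an adjacency matrix using the Laplacian
$\Lambda_{ij} = \delta_{ij}k_i - a_{ij}$. It assumes a connected, undirected network, so
that the second smallest eigenvalue is the lowest non-zero one; the function
name and the two small test graphs are illustrative.

```python
import numpy as np


def eigenratio(adj):
    """Return (Q, lambda_N) for a connected, undirected network."""
    a = np.asarray(adj, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    eig = np.linalg.eigvalsh(laplacian)   # real eigenvalues, ascending order
    # eig[0] is the zero mode; eig[1] is lambda_2 if the graph is connected.
    return eig[-1] / eig[1], eig[-1]


if __name__ == "__main__":
    complete = np.ones((4, 4)) - np.eye(4)   # homogeneous: all degrees equal
    star = np.zeros((4, 4))
    star[0, 1:] = star[1:, 0] = 1            # one hub and three leaves
    print(eigenratio(complete)[0], eigenratio(star)[0])
```

The homogeneous graph gives the smallest possible ratio, $Q = 1$, while the
star, a caricature of a heterogeneous topology, gives a larger value.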

On the whole, we find that three classes of network - homogeneous, scale- 
free (at the critical point) and ones composed of starlike structures, with a 
great many small-degree nodes connected to a few hubs - can emerge for any 
kind of attachment/detachment rules. It follows that a network subject to 
some sort of optimising mechanism, such as Natural Selection for the case of 
living systems, could thus evolve towards whichever topology best suits its 
requirements by tuning these microscopic actions. 
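
The stochastic rules discussed in this chapter can be simulated directly. The
following is a minimal sketch, not the code used for the figures. It assumes
that in each time step the numbers of edge additions and deletions are Poisson
distributed with means $n(1 - \kappa/\kappa_{\max})$ and $n\kappa/\kappa_{\max}$ (which reproduces the
drift $d\kappa/dt = 2(n/N)(1 - 2\kappa/\kappa_{\max})$), that the node gaining an edge is chosen
with probability proportional to $k^{\alpha}$ and joined to a uniformly chosen
non-neighbour, and that an edge is deleted by choosing a node with probability
proportional to $k$ and then one of its edges at random. The function name
`simulate_network`, the ring-lattice initial condition and the default
parameters are illustrative assumptions.

```python
import numpy as np


def simulate_network(N=200, kappa_max=20, n=10, alpha=1.0, steps=1500, seed=0):
    """Evolve a simple undirected network and return its final degree sequence."""
    rng = np.random.default_rng(seed)
    adj = np.zeros((N, N), dtype=bool)
    idx = np.arange(N)
    for s in range(1, kappa_max // 2 + 1):   # ring lattice with degree kappa_max
        adj[idx, (idx + s) % N] = True
        adj[(idx + s) % N, idx] = True
    k = adj.sum(axis=1).astype(float)

    for _ in range(steps):
        frac = min(max(k.mean() / kappa_max, 0.0), 1.0)
        n_add = rng.poisson(n * (1.0 - frac))    # global growth term
        n_del = rng.poisson(n * frac)            # global pruning term
        for _ in range(n_add):
            w = k ** alpha                       # local rule pi(k) ~ k^alpha
            if w.sum() == 0:
                break
            i = rng.choice(N, p=w / w.sum())
            free = np.flatnonzero(~adj[i])
            free = free[free != i]               # no self-loops, no double edges
            if free.size == 0:
                continue
            j = rng.choice(free)
            adj[i, j] = adj[j, i] = True
            k[i] += 1
            k[j] += 1
        for _ in range(n_del):
            if k.sum() == 0:
                break
            i = rng.choice(N, p=k / k.sum())     # local rule sigma(k) ~ k
            j = rng.choice(np.flatnonzero(adj[i]))
            adj[i, j] = adj[j, i] = False
            k[i] -= 1
            k[j] -= 1
    return k


if __name__ == "__main__":
    degrees = simulate_network()
    print("mean degree:", degrees.mean(), "max degree:", degrees.max())
```

Under these assumptions the mean degree should relax from $\kappa_{\max}$ towards
$\kappa_{\max}/2$ whatever the value of $\alpha$, while the spread of the degree sequence
depends on the local rules.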



Footnote: Using pacemaker nodes, scale-free networks have also been shown to
emerge via rules which maximize synchrony (Sendiña-Nadal et al., 2008).



3.5 Correlations

Most real networks have been found to exhibit degree-degree correlations, also
known as mixing by degree (Pastor-Satorras et al., 2001; Newman, 2003d).
They can thus be classified as assortative, when the degree of a typical node
is positively correlated with that of its neighbours, or disassortative, when
the correlation is negative. This property has important implications for
network characteristics such as connectedness and robustness (Newman, 2002,
2003a). A useful measure of this phenomenon is Pearson's correlation
coefficient applied to the edges (Newman, 2003; Boccaletti et al., 2006):
$r = ([k_l k'_l] - [k_l]^2)/([k_l^2] - [k_l]^2)$, where $k_l$ and $k'_l$ are the degrees of
each of the two nodes pertaining to edge $l$, and
$[\cdot] \equiv (\langle k\rangle N)^{-1}\sum_l(\cdot)$ represents an average over edges; $|r| \le 1$.
Writing $\sum_l(\cdot) = \sum_{ij} a_{ij}(\cdot)$, $r$ can be expressed in terms of averages over
nodes:

$$r = \frac{\langle k\rangle\langle k^2 k_{nn}(k)\rangle - \langle k^2\rangle^2}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}, \qquad (3.9)$$



where $k_{nn}(k)$ is the mean nearest-neighbour-degree function; i.e., if
$k_{nn,i} = k_i^{-1}\sum_j a_{ij}k_j$ is the mean degree of the neighbours of node $i$,
$k_{nn}(k)$ is its average over all nodes such that $k_i = k$. Whereas most social
networks are assortative ($r > 0$) - due, probably, to mechanisms such as
homophily (Newman, 2003c) - almost all other networks, whether biological,
technological or information-related, seem to be generically disassortative.
The top right inset in Fig. 3.3 displays the stationary value of $r$ obtained in
the same networks as in the main panel and lower inset. It turns out that the
heterogeneous regime is disassortative, the more so the larger $\alpha$. (Note that a
completely homogeneous network cannot have degree-degree correlations, since
all degrees are the same.) It is known that the restriction of having at most
one edge per pair of nodes induces disassortativity (Park and Newman, 2003;
Maslov et al., 2004). However, in our case this is not the sole origin of the
correlations, as can also be seen in the same inset of Fig. 3.3, where we have
plotted $r$ for networks in which we have lifted the restriction and allowed any
number of edges per pair of nodes. In fact, when multiple edges are allowed,
the correlations are slightly stronger. As we shall discuss in Chapter 5, there
is a general entropic reason for heterogeneous networks, in their equilibrium
state (i.e., in the absence of correlating mechanisms), to become
disassortative (Johnson et al., 2010b). But neither is this here the case,
since the networks generated are driven from equilibrium by the mechanisms of
preferential attachment and detachment.
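
The equivalence between the edge-based definition of Pearson's coefficient and
its expression in terms of node averages,
$r = (\langle k\rangle\langle k^2 k_{nn}\rangle - \langle k^2\rangle^2)/(\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2)$, can be
checked numerically. The sketch below does so for a given adjacency matrix; it
assumes an undirected network with no isolated nodes, and the function names
are our own.

```python
import numpy as np


def pearson_from_edges(adj):
    """r computed as a correlation over edge ends (each edge counted twice)."""
    a = np.asarray(adj, dtype=float)
    k = a.sum(axis=1)
    i, j = np.nonzero(a)
    ki, kj = k[i], k[j]
    mean_end = ki.mean()
    return ((ki * kj).mean() - mean_end ** 2) / ((ki ** 2).mean() - mean_end ** 2)


def pearson_from_nodes(adj):
    """r computed from node averages and the neighbour degrees k_nn,i."""
    a = np.asarray(adj, dtype=float)
    k = a.sum(axis=1)
    knn = (a @ k) / k                      # mean degree of each node's neighbours
    k1, k2, k3 = k.mean(), (k ** 2).mean(), (k ** 3).mean()
    return (k1 * (k ** 2 * knn).mean() - k2 ** 2) / (k1 * k3 - k2 ** 2)


if __name__ == "__main__":
    path = np.zeros((4, 4))
    for u in range(3):                     # path graph 0-1-2-3
        path[u, u + 1] = path[u + 1, u] = 1
    print(pearson_from_edges(path), pearson_from_nodes(path))
```

Both functions should return the same value for any simple graph; a star, for
example, is maximally disassortative in this sense.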

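To understand how these specific correlations come about, consider a pair
of nodes $(i, j)$ which, at a given moment, can either be occupied by an edge or
unoccupied. We will call the expected times of permanence for occupied and
unoccupied states $\tau^{o}_{ij}$ and $\tau^{u}_{ij}$, respectively. After sufficient evolution
time (so that occupancy becomes independent of the initial state), the expected
value of the corresponding element of the adjacency matrix,
$E(a_{ij}) \equiv \hat\epsilon_{ij}$, will be

$$\hat\epsilon_{ij} = \frac{\tau^{o}_{ij}}{\tau^{o}_{ij} + \tau^{u}_{ij}}.$$

If $p^{+}_{ij}$ ($p^{-}_{ij}$) is the probability that $(i, j)$ will become occupied
(unoccupied) given that it is unoccupied (occupied), then
$\tau^{o}_{ij} \sim 1/p^{-}_{ij}$ and $\tau^{u}_{ij} \sim 1/p^{+}_{ij}$, yielding

$$\hat\epsilon_{ij} = \left(1 + \frac{p^{-}_{ij}}{p^{+}_{ij}}\right)^{-1}.$$

Taking into account the probability that each node has of gaining or losing an
edge, we obtain: $p^{+}_{ij} = u(\langle k\rangle)N^{-1}[\pi(k_i) + \pi(k_j)]$ and
$p^{-}_{ij} = d(\langle k\rangle)[\sigma(k_i)/k_i + \sigma(k_j)/k_j]$. Then, assuming that the network is
sparse enough that $p^{-}_{ij} \gg p^{+}_{ij}$ (since the number of edges is much smaller
than the number of pairs), and particularising for power-law local
probabilities $\pi(k) \sim k^{\alpha}$ and $\sigma(k) \sim k^{\beta}$, the expected occupancy of the pair
is

$$\hat\epsilon_{ij} \simeq \frac{p^{+}_{ij}}{p^{-}_{ij}} = \frac{u(\langle k\rangle)\langle k^{\beta}\rangle}{d(\langle k\rangle)\langle k^{\alpha}\rangle N}\;\frac{k_i^{\alpha} + k_j^{\alpha}}{k_i^{\beta-1} + k_j^{\beta-1}}.$$

Considering the stationary state, when $u(\langle k\rangle) = d(\langle k\rangle)$, and for the case of
random deletion of edges, $\beta = 1$ (so that the only nonlinearity is due to $\alpha$),
the previous expression reduces to

$$\hat\epsilon_{ij} = \frac{\langle k\rangle}{2\langle k^{\alpha}\rangle N}\left(k_i^{\alpha} + k_j^{\alpha}\right). \qquad (3.10)$$

(Note that this matrix is not consistent term by term, since
$\sum_j \hat\epsilon_{ij} \neq k_i$, although it is globally consistent:
$\sum_{ij}\hat\epsilon_{ij} = \langle k\rangle N$.) The nearest-neighbour-degree function is now

$$k_{nn}(k_i) = \frac{1}{k_i}\sum_j \hat\epsilon_{ij}k_j = \frac{\langle k\rangle}{2\langle k^{\alpha}\rangle}\left(\langle k\rangle k_i^{\alpha-1} + \langle k^{\alpha+1}\rangle k_i^{-1}\right)$$

(a decreasing function for any $\alpha$), with the result that Pearson's coefficient
becomes

$$r = \frac{1}{\langle k^{\alpha}\rangle}\left(\frac{\langle k\rangle^3\langle k^{\alpha+1}\rangle - \langle k^2\rangle^2\langle k^{\alpha}\rangle}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}\right). \qquad (3.11)$$

Footnotes: The occupancy will always become independent of the initial state
eventually, since the process is ergodic. In the expressions for $p^{\pm}_{ij}$ we
are, again, ignoring corrections due to the fact that $i$ is necessarily
different from $j$.

More generally, one can understand the emergence of these correlations in
the following way. For the network to become heterogeneous, we must have
$\pi(k) + N^{-1} > \sigma(k) + k/(\langle k\rangle N)$ for large enough $k$, so that highly connected
nodes do not lose more edges than they can acquire (as discussed earlier in
this chapter). This implies that $\pi(k)$ must be increasing and approximately
linear or superlinear. The expected value of the degree of a node $i$, chosen
according to $\pi(k_i)$, is then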
E{ki) = Ylk'^W^ ~ while that of its new, randomly chosen 

neighbour, $j$, is only $E(k_j) = \langle k\rangle$. This induces disassortative correlations
which can never be compensated by the breaking of edges between nodes whose
expected degree values are A^~^ J2k c"(^)^ and if a{k) is an increasing 

function. It thus ensues that a scenario such as the one analysed in this chapter
will never lead to assortative networks except for some cases in which $\sigma(k)$
is a decreasing function - meaning that less connected nodes should be more
likely to lose edges. Assortativity could, however, arise if there were some bias
also on the node chosen to be $i$'s neighbour, e.g. on the postsynaptic neuron
- which is precisely what happens in most social networks, where individuals
do not generally choose their friends, partners, etc. randomly. Although there
seem to be other reasons for the ubiquity of disassortative networks in nature
(Johnson et al., 2010b), it is possible that the generality of the scenario
studied here may also play a part.
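
The stationary expected-value matrix for random deletion ($\beta = 1$), namely
$\hat\epsilon_{ij} = \langle k\rangle(k_i^{\alpha} + k_j^{\alpha})/(2\langle k^{\alpha}\rangle N)$, lends itself to a simple
numerical check. The sketch below builds the matrix for an arbitrary degree
sequence, verifies the global consistency $\sum_{ij}\hat\epsilon_{ij} = \langle k\rangle N$, and
compares the Pearson coefficient obtained from the resulting $k_{nn}$ with the
closed form $r = [\langle k\rangle^3\langle k^{\alpha+1}\rangle - \langle k^2\rangle^2\langle k^{\alpha}\rangle]/\{\langle k^{\alpha}\rangle[\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2]\}$.
As in the text, the diagonal terms $i = j$ are not excluded. The degree
sequence used is random and purely illustrative, and the function names are
our own.

```python
import numpy as np


def expected_adjacency(k, alpha):
    """Matrix of expected occupancies for beta = 1 in the stationary state."""
    k = np.asarray(k, dtype=float)
    ka = k ** alpha
    return k.mean() / (2.0 * ka.mean() * k.size) * (ka[:, None] + ka[None, :])


def knn_expected(k, alpha):
    """Mean neighbour degree of each node implied by the expected matrix."""
    k = np.asarray(k, dtype=float)
    return expected_adjacency(k, alpha) @ k / k


def pearson_from_knn(k, knn):
    """Pearson's r written in terms of node averages."""
    k = np.asarray(k, dtype=float)
    k1, k2, k3 = k.mean(), (k ** 2).mean(), (k ** 3).mean()
    return (k1 * (k ** 2 * knn).mean() - k2 ** 2) / (k1 * k3 - k2 ** 2)


def pearson_closed_form(k, alpha):
    """Closed-form r for power-law growth and random deletion."""
    k = np.asarray(k, dtype=float)
    k1, k2, k3 = k.mean(), (k ** 2).mean(), (k ** 3).mean()
    ka, ka1 = (k ** alpha).mean(), (k ** (alpha + 1)).mean()
    return (k1 ** 3 * ka1 - k2 ** 2 * ka) / (ka * (k1 * k3 - k2 ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    degrees = rng.integers(1, 50, size=500)      # illustrative degree sequence
    alpha = 1.35
    eps = expected_adjacency(degrees, alpha)
    print(eps.sum(), degrees.mean() * degrees.size)
    print(pearson_from_knn(degrees, knn_expected(degrees, alpha)),
          pearson_closed_form(degrees, alpha))
```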

We can use the expected value matrix $\hat\epsilon$ to estimate other magnitudes.
For example, the clustering coefficient, as defined by Watts and Strogatz
(Watts and Strogatz, 1998), is an average over nodes of $C_i$, with $C_i$ the
proportion of $i$'s neighbours which are connected to each other; so its expected
value is $E(C_i) = \hat\epsilon_{jl}$ conditioned to $j$ and $l$ being neighbours of $i$'s.
This means that, on average, we can make the approximation that

$$k_j = k_l = \langle k_{nn}\rangle = \frac{\langle k\rangle}{2\langle k^{\alpha}\rangle}\left[\langle k\rangle\langle k^{\alpha-1}\rangle + \langle k^{\alpha+1}\rangle\langle k^{-1}\rangle\right].$$

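Substituting this value in Eq. (3.10), and taking into account that one edge
of $j$'s and one of $l$'s are taken up by $i$, we have

$$C \simeq \frac{\langle k\rangle}{\langle k^{\alpha}\rangle N}\left(\langle k_{nn}\rangle - 1\right)^{\alpha}. \qquad (3.12)$$
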
For a rough estimate of the mean minimum path (the minimum path between
two nodes being the smallest number of edges one has to follow to get from one
to the other), we can proceed as Albert and Barabási (2002). For a given node,
let us define the number of nearest neighbours, $z_1$, next-nearest neighbours,
$z_2$, and in general $m$th neighbours, $z_m$. Using the relation
$z_m = z_1(z_2/z_1)^{m-1}$, and assuming that the network is connected and can be
obtained in $l$ steps, this yields

$$1 + \sum_{m=1}^{l} z_m = N. \qquad (3.13)$$

On average, $z_1 = \langle k\rangle$ and $z_2 = \langle k\rangle[(1 - C)\langle k_{nn}\rangle - 1]$ (since for each
second nearest neighbour, one edge goes to the reference node and a proportion
$C$ to mutual neighbours). Now, if $N \gg z_1$ and $z_2 \gg z_1$, Eq. (3.13) leads to

$$l \simeq 1 + \frac{\ln(N/\langle k\rangle)}{\ln[(1 - C)\langle k_{nn}\rangle - 1]}. \qquad (3.14)$$
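
The estimate of the mean minimum path,
$l \simeq 1 + \ln(N/\langle k\rangle)/\ln[(1 - C)\langle k_{nn}\rangle - 1]$, can be evaluated in a few lines.
In the sketch below, $N$ and $\langle k\rangle$ are those of the C. Elegans network
discussed in the next section, whereas the values of $C$ and $\langle k_{nn}\rangle$ passed
to the function are hypothetical inputs chosen only to exercise the formula.
The check at the end confirms that the outermost shell, $z_1(z_2/z_1)^{l-1}$,
contains of the order of $N$ nodes, which is the approximation behind the
formula.

```python
import math


def mean_path_estimate(N, k_mean, clustering, knn_mean):
    """Rough mean minimum path from the branching ratio z2/z1."""
    branching = (1.0 - clustering) * knn_mean - 1.0   # z2 / z1
    if branching <= 1.0:
        raise ValueError("the estimate requires z2 > z1")
    return 1.0 + math.log(N / k_mean) / math.log(branching)


if __name__ == "__main__":
    N, k_mean = 307, 14.0
    clustering, knn_mean = 0.28, 30.0   # hypothetical inputs, not from the text
    l = mean_path_estimate(N, k_mean, clustering, knn_mean)
    branching = (1.0 - clustering) * knn_mean - 1.0
    print(l, k_mean * branching ** (l - 1.0))   # second number should equal N
```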

3.6 The C. Elegans neural network 

There exists a biological neural network which has been entirely mapped
(although not, to the best of our knowledge, at different stages of
development) - that of the much-investigated worm C. Elegans (White et al.,
1986; Watts and Strogatz, 1998). With a view to testing whether such a network
could arise via simple stochastic rules of the kind we are here considering, we
ran simulations for the same number of nodes, $N = 307$, and (stationary) mean
degree, $\langle k\rangle = 14.0$ (in the simple, undirected representation of the network).
Using the global probabilities given by Eq. (3.6) and local ones
$\pi(k) \sim k^{\alpha}$ and $\sigma(k) \sim k$ (as in Fig. 3.3), we obtain a surprising result.
Precisely at the critical point, $\alpha = \alpha_c \simeq 1.35$, there are some remarkable
similarities between the biological network and the ones produced by the model.

Figure 3.4 displays the degree distributions, both for the empirical network
and for the average (stationary) simulated network corresponding to the
critical point, while the top inset shows the mean-nearest-neighbour degree
function $k_{nn}(k)$ for the same networks. Both $p(k)$ and $k_{nn}(k)$ of the simulated
networks can be seen to be very similar to those measured in the biological one.
Furthermore, as is displayed in Table 3.1, the clustering coefficient obtained in
simulation is almost the same as the empirical one. The mean minimum path is
similar though slightly smaller in simulation, probably due to the worm's brain
having modules related to functions (Arenas et al., 2008b). Finally, Pearson's
coefficient is also in fairly good agreement, although the simulated networks
are actually a bit more disassortative. It should, however, be stressed that the
simulation results are for averages over 100 runs, while the biological system is
equivalent to a single run; given the small number of neurons, statistical
fluctuations can be fairly large, so one should refrain from attributing too much
Figure 3.4: Degree distribution (binned) of the C. Elegans neural network
(circles) (White et al., 1986) and that obtained with MC simulations (line) in

the stationary state (t = 10^ steps) for an equivalent network in which edges 
are removed randomly ($\beta = 1$) at the critical point ($\alpha = 1.35$). $N = 307$,
$\langle k\rangle = 14.0$, averages over 100 runs. Global probabilities as in Eq. (3.6). The slope is
for A;~^/^. Top right inset: mean-neighbour-degree function knnik) as measured 
in the same empirical network (circles) and as given by the same simulations 
(line) as in the main panel. The slope is for Bottom left inset: 

of equivalent network for a range of a, both from simulations (circles) and as 
obtained with Eq. (3.1). (See also Table 3.1.)



importance to the precise values obtained - at least until we can average over
100 worms. Table 3.1 also shows the values of $C$, $l$ and $r$ both as estimated
from the theory laid out in Section 3.5 and for the equivalent network in the
configuration model (Newman, 2003d) - generally taken as the null model for
heterogeneous networks, where the probability of an edge existing between
nodes $i$ and $j$ is $k_i k_j/(\langle k\rangle N)$. It is clear that whereas the
configuration-model predictions deviate substantially from the magnitudes
measured in the C. Elegans neural network, the growth process we are here
considering accounts for them quite well. It is interesting that it should be at
the critical point that a structural topology so similar to the empirical one
emerges, since it seems that the brain's functional topology may also be related
to a critical point (Chialvo, 2004; Chialvo et al., 2008).
        Experiment    Simulation    Theory    Config.
C       0.28          0.28          0.23      0.15
l       2.46          2.19          1.86      1.96
r       -0.163        -0.207        -0.305    -0.101

Table 3.1: Values of small-world parameters $C$ and $l$, and Pearson's
correlation coefficient $r$, as measured in the neural network of the worm
C. Elegans (White et al., 1986), and as obtained from simulations in the
stationary state
[t = 10^ steps) for an equivalent network at the critical point when edges are 
removed randomly - i.e., for $\alpha = 1.35$ and $\beta = 1$. $N = 307$, $\langle k\rangle = 14.0$;
averages over 100 runs and global probabilities as in Eq. (3.6). Theoretical
estimates correspond to Eqs. (3.12), (3.14) and (3.11) applied to the networks
generated by the same simulations. The last column lists the respective
configuration model values: $C$ and $l$ are obtained theoretically as in
(Newman, 2003d), while $r$, from MC simulations as in (Maslov et al., 2004), is
the value expected due to the absence of multiple edges. (See also Fig. 3.4.)



3.7 Discussion 



With this work we have attempted, on the one hand, to extend our under- 
standing of evolving networks so that any choice of transition probabilities 
dependent on local and/or global degrees can be treated analytically, thereby 
obtaining some model-independent results; and on the other, to illustrate how 
such a framework can be applied to realistic biological scenarios. For the latter, 
we have used two examples relating to two rather different nervous systems: 

i) synaptic pruning in humans, for which the use of nonlinear global prob- 
abilities reproduces the initial increase and subsequent depletion in synaptic 
density in good accord with experiments - to the extent that nonmonotonic 
data points spanning a lifetime can be very well fitted with only two parame- 
ters; and 

ii) the structure of the C. Elegans neural network, for which it turns out that
by only considering the numbers of nodes and edges, and imposing random
deletion of edges and power-law probability of growth, the critical point leads
to networks exhibiting many of the worm's nontrivial features - such as the
degree distribution, small-world parameters, and even level of disassortativity.

These examples indicate that it is not far-fetched to contemplate how
many structural features of the brain or other networks - and not just the
degree distributions - could arise by simple stochastic rules like the ones
considered; although, undoubtedly, other ingredients such as natural modularity
(Arenas et al., 2008b), a metric (Kaiser and Hilgetag, 2004) or functional
requirements (Sporns et al., 2004) can also be expected to play a role in many
instances. We hope, therefore, that the framework laid out here - in which
for simplicity we have assumed the network to be undirected and to have a 
fixed size, although generalizations are straightforward - may prove useful for 
interpreting data from a variety of fields. It would be particularly interesting 
to try to locate and quantify the biological mechanisms assumed to be behind 
this kind of network dynamics. 



Chapter 4 



Bringing on the Edge of Chaos 
with heterogeneity 

The collective behaviour of systems of coupled excitable elements, such as 
neurons, has been shown to depend significantly on the heterogeneity of the 
degree distribution of the underlying network of interactions. For instance, 
broad - in particular, scale-free - distributions have been found to improve
static memory performance in neural-network models. Here we look at the
influence of degree heterogeneity in a neural network which, due to the effect
of synaptic depression (a kind of fatigue of the interaction strengths), exhibits
chaotic behaviour. Not only can the existence of a chaotic phase be related
to neurophysiological experiments; it allows the system to perform a class
of dynamic pattern-recognition tasks. We find first of all that, as has been
described in a few other systems, optimal performance is achieved close to the 
phase transition - i.e., at the so-called Edge of Chaos. Furthermore, we obtain 
a functional relationship between the level of synaptic depression required to 
bring on chaos and the heterogeneity of the degree distribution. This result 
points to a clear advantage of low-exponent scale-free networks, and suggests 
an explanation for their apparent ubiquity in certain biological systems. 



4.1 Exciting cooperation 



Excitable systems allow for the regeneration of waves propagating through
them, and may thus respond vigorously to weak stimuli. The brain and other
parts of the nervous system are well-studied paradigms, and forest fires with
constant ignition of trees and autocatalytic reactions in surfaces, for instance,
also share some of the basics (Bak et al., 1990; Meron, 1992; Lindner et al.,
2004; Izhikevich, 2007; Arenas et al., 2008a). The fact that signals are not
gradually damped by friction in these cases is a consequence of cooperation
among many elements in a nonequilibrium setting. These systems can be
seen as large networks of nodes that are "excitable". This admits various
realizations, but typically means that each element has a threshold and a
refractory time between consecutive responses - a behaviour that impedes
thermal equilibrium.

Some brain tasks can be simulated with mathematical neural networks. As
described in Chapter 2, these consist of neurons - often modelled as variables
which are as simple as possible while still able to display the essence of the
cooperative behaviour of interest - connected by edges representing synapses
(Amari, 1972; Hopfield, 1982; Amit, 1989; Torres and Varona, 2010). If the
edges are weighted according to some prescription - such as the Hebb rule
(Hebb, 1949) - which saves information from a set of given patterns of
activity (particular configurations of active and inactive neurons), these
patterns become attractors of the phase-space dynamics. Therefore, the system is
then able to retrieve the stored patterns; this mechanism is known as
associative memory. Actual neural systems do much more than just recalling a
memory and staying there, however. That is, one should expect dynamic
instabilities or some other destabilizing mechanism. This expectation is
reinforced by recent experiments suggesting that synapses undergo rapid changes
with time which may both determine brain tasks (Abbott et al., 1997; Tsodyks
et al., 1998; Hilfiker et al., 1999; Pantic et al., 2002) and induce irregular
and perhaps chaotic activity (Barrie et al., 1996; Korn and Faure, 2003).

One may argue that the observed rapid changes (which have been found
to cause "synaptic depression" and/or "facilitation" on the time scale of
milliseconds (Tsodyks et al., 1998; Pantic et al., 2002) - i.e., much faster than
the plasticity processes whereby synapses store patterns (Malenka and Nicoll,
1999)) may simply correspond to the characteristic behaviour of single
excitable elements. Furthermore, a fully-connected network which describes
cooperation among such excitable elements has recently been shown to exhibit
both attractors and chaotic instabilities (Marro et al., 2008). The work
described here, first reported by Johnson et al. (2008), extends and generalizes
this study to conclude on the influence of the excitable network topology on
dynamic behaviour. We show, in particular, an interesting correlation between
certain wiring topology and optimal functionality.

Footnote: Several studies have already shown that binary neurons can capture the
essence of cooperation in many more complex settings. See, for instance,
(Pantic et al., 2002) in the case of integrate-and-fire neuron models of
pyramidal cells.



4.2 The Fast-Noise model 

Consider $N$ binary nodes ($s_i = \pm 1$) and the adjacency matrix, $a_{ij} = 1, 0$, which
indicates the existence or not of an edge between nodes $i, j = 1, 2, ..., N$. Let
there be a set of $M$ patterns, $\xi_i^{\nu} = \pm 1$, $\nu = 1, ..., M$ (which we generate here at
random), and assume that they are "stored" by giving each edge a base weight
$\bar\omega_{ij} = N^{-1}\sum_{\nu}\xi_i^{\nu}\xi_j^{\nu}$. Actual weights are dynamic, however, namely, $\omega_{ij} = \bar\omega_{ij}x_j$,

where $x_j$ is a stochastic variable. Assuming the limit in which this varies in a
time scale infinitely smaller than the one for node dynamics, we can consider
a stationary distribution such as
$P(x_j|S) = q\,\delta(x_j - \Xi_j) + (1 - q)\,\delta(x_j - 1)$, $S = \{s_j\}$, for instance. This
amounts to assuming that, at each time step, every connection has a
probability $q$ of altering its weight by a factor $\Xi_j$ which is a function (to be
determined) of the local field at $j$, defined as the net current arriving at $j$
from other nodes. This choice differs essentially from the one used by Marro
et al. (2008), where $q$ depends on the global degree of order and $\Xi_j$ is a
constant independent of $j$.

Assume independence of the noise at different edges, and that the transition
rate for the stochastic changes is

$$\frac{c(S \to S^i)}{c(S^i \to S)} = \prod_j \frac{\int dx_j\,P(x_j|S)\,\Psi(u_{ij})}{\int dx_j\,P(x_j|S^i)\,\Psi(-u_{ij})},$$

where $u_{ij} \equiv s_i s_j x_j \bar\omega_{ij}T^{-1}$, $\Psi(u) = \exp(-u/2)$ to have proper contour
conditions, $T$ is a "temperature" or stochasticity parameter, and $S^i$ stands
for $S$ after the change $s_i \to -s_i$. (This formalism and its interpretation are
described in detail by Marro and Dickman.) We define the effective local fields
$h_i^{\rm eff} = h_i^{\rm eff}(S, T, q)$ via
$\prod_j \varphi_{ij}^{-}/\varphi_{ij}^{+} = \exp(-h_i^{\rm eff}s_i/T)$, where
$\varphi_{ij}^{\pm} \equiv q\exp(\pm\Xi_j v_{ij}) + (1 - q)\exp(\pm v_{ij})$, with
$v_{ij} = \frac{1}{2}a_{ij}u_{ij}$. Effective weights $\omega_{ij}^{\rm eff}$ then follow from
$h_i^{\rm eff} = \sum_j \omega_{ij}^{\rm eff}s_j a_{ij}$. To obtain an analytical expression, we linearize
around $\bar\omega_{ij} = 0$ (a good approximation when $M \ll N$), which yields

$$\omega_{ij}^{\rm eff} = [1 + q(\Xi_j - 1)]\,\bar\omega_{ij}.$$
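
The quality of this linearization can be examined numerically. The sketch below
compares the effective field defined through
$\prod_j \varphi_{ij}^{-}/\varphi_{ij}^{+} = \exp(-h_i^{\rm eff}s_i/T)$ with the linearized result
$h_i^{\rm eff} \simeq \sum_j [1 + q(\Xi_j - 1)]\bar\omega_{ij}s_j$ for a single node $i$ and a set of
its neighbours. It assumes that $v_{ij} = s_i s_j\bar\omega_{ij}/(2T)$ once the stochastic
factor $x_j$ has been integrated out, which is our reading of the definitions
above; the random weights and factors $\Xi_j$ are arbitrary test values, and the
function names are our own.

```python
import numpy as np


def field_exact(s_i, s, w_bar, Xi, q, T):
    """Effective field from the product of phi^- / phi^+ over neighbours."""
    v = 0.5 * s_i * s * w_bar / T
    phi_minus = q * np.exp(-Xi * v) + (1.0 - q) * np.exp(-v)
    phi_plus = q * np.exp(Xi * v) + (1.0 - q) * np.exp(v)
    # exp(-h s_i / T) = prod(phi^-/phi^+)  and  s_i**2 = 1
    return -T * s_i * np.sum(np.log(phi_minus / phi_plus))


def field_linear(s, w_bar, Xi, q):
    """Field obtained with the linearized effective weights."""
    return np.sum((1.0 + q * (Xi - 1.0)) * w_bar * s)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_neigh = 50
    s = rng.choice([-1.0, 1.0], size=n_neigh)     # states of the neighbours
    w_bar = 1e-3 * rng.standard_normal(n_neigh)   # small base weights
    Xi = rng.uniform(0.0, 2.0, size=n_neigh)      # arbitrary fatigue factors
    print(field_exact(1.0, s, w_bar, Xi, 0.3, 1.0), field_linear(s, w_bar, Xi, 0.3))
```

For small base weights the two values should coincide up to corrections of
third order in $\bar\omega_{ij}/T$, since the logarithm of $\varphi^{-}/\varphi^{+}$ is an odd
function of $v_{ij}$.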



In order to fix $\Xi_j$ here, we first introduce the overlap vector
$\vec m = (m^1, ..., m^M)$, with $m^{\nu} \equiv N^{-1}\sum_i \xi_i^{\nu}s_i$, which measures the correlation
between the current configuration and each of the stored patterns, and the
local one $\vec m_j$ of components $m_j^{\nu} \equiv \langle k\rangle^{-1}\sum_i \xi_i^{\nu}s_i a_{ij}$, where $\langle k\rangle$
is the mean node connectivity, i.e., the average of $k_i = \sum_j a_{ij}$. We then
assume, for any $q \neq 0$, that the relevant factor is
$\Xi_j = 1 + \zeta_j(\vec m_j)(\Phi - 1)/q$, with



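where $\chi \equiv N/\langle k\rangle$ and $\alpha > 0$ is a parameter. This comes from the fact that
the field at node $j$ can be written as a sum of components from each pattern,
namely, $h_j = \sum_{\nu}h_j^{\nu}$, where

$$h_j^{\nu} = \xi_j^{\nu}N^{-1}\sum_i a_{ij}\xi_i^{\nu}s_i = \chi^{-1}\xi_j^{\nu}m_j^{\nu}.$$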

Our choice for $\Xi_j$, which amounts to assuming that the "fatigue" at a given
edge increases with the field at the preceding node $j$ (and allows us to recover
the fully-connected limit described by Marro et al. (2008) if $\alpha = 2$), finally
leads to

$$\omega_{ij}^{\rm eff} = [1 + (\Phi - 1)\zeta_j(\vec m_j)]\,\bar\omega_{ij}.$$

Varying $\Phi$ one sets the nature of the weights. That is, $0 < \Phi < 1$ corresponds
to resistance (depression) due to heavy local work, while the edge facilitates
- i.e., tends to increase the effect of the signal under the same situation -
for $\Phi > 1$. (The action of the edge is reversed for negative $\Phi$.) We performed
Monte Carlo simulations using standard parallel updating with the effective
rates $c(S \to S^i)$ computed using the latter effective weights.



4.3 Edge of Chaos 

It is possible to solve the single pattern case ($M = 1$) under a mean-field
assumption, which is a good approximation for large enough connectivity. That
is, we may substitute the matrix $\hat{a}_{ij}$ by its mean value over network
realizations to obtain analytical results that are independent of the
underlying disorder. Imagine that each node hosts $k_i$ half-edges according to
a distribution $p(k)$, the total number of half-edges in the network being
$\langle k\rangle N$. Choose a node $i$ at random and randomly join one of its
half-edges to an available free half-edge. The probability that this half-edge
ends at node $j$ is $k_j/(\langle k\rangle N)$. Once all the nodes have been
linked up, the expected value (as a quenched average over network
realizations) for the number of edges joining nodes $i$ and $j$ is
$E(\hat{a}_{ij}) = k_i k_j/(\langle k\rangle N)$. (Assuming one edge at most
between any two nodes, $\hat{a}_{ij} = 0, 1$, the value will be slightly
smaller, but it is easy to prove that this is also a good approximation if the
network has a structural cut-off: $k_i < \sqrt{\langle k\rangle N}$,
$\forall i$.) This expression, which can be seen as a definition of the
so-called configuration model for complex networks (Newman, 2003c), is valid
for random networks with a given degree sequence (or, in practice, a given
degree distribution) that have zero degree-degree correlations between
neighbours (Johnson et al., 2010b). Using the notation $\eta_i = \xi_i s_i$, we
have $m_j = \chi\langle \eta_i \hat{a}_{ij}\rangle_i =
\langle k\rangle^{-1}\sum_i \eta_i \hat{a}_{ij}$. Because node activity is not
statistically independent of connectivity (Torres et al., 2004), we must define
a new set of overlap parameters, analogous to $m$ and $m_j$. That is,
$\mu_n = \langle k_i^n \eta_i\rangle_i/\langle k^n\rangle$ and the local
versions $\mu_n^j = \chi\langle k_i^n \eta_i \hat{a}_{ij}\rangle_i/\langle k^n\rangle$.
After using $\hat{a}_{ij} = E(\hat{a}_{ij})$, one obtains the relation
$\mu_n^j = \langle k^{n+1}\rangle k_j \mu_{n+1}/(\langle k^n\rangle\langle k\rangle^2)$.
Inserting this expression into the definition of $\mu_n$, and substituting
$\langle s_i\rangle = \tanh[T^{-1} h_i^{\rm eff}(S)]$ (for large $N$), standard
mean-field analysis yields



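$$\mu_n(t+1) = \frac{1}{\langle k^n\rangle}\left\langle k^n \tanh M_{T,\Phi}(k,t)\right\rangle_k,$$

where the last quantity is defined as

$$M_{T,\Phi}(k,t) = \frac{k}{TN}\left[\mu_1(t) + (\Phi - 1)\frac{\langle k^{\alpha+1}\rangle}{\langle k\rangle^{\alpha+1}}\,|\mu_1(t)|^{\alpha}\mu_{\alpha+1}(t)\right].$$

This is a two-dimensional map which is valid for any random topology of
distribution $p(k)$. Note that the macroscopic magnitude of interest is
$\mu_0 = m$.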
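As a rough numerical illustration of how a map of this kind can be iterated, the Python sketch below assumes the form $\mu_n(t+1) = \langle k^n \tanh M\rangle_k/\langle k^n\rangle$ with $M = (k/TN)[\mu_1 + (\Phi - 1)R\,|\mu_1|^{\alpha}\mu_{\alpha+1}]$ and $R = \langle k^{\alpha+1}\rangle/\langle k\rangle^{\alpha+1}$. The function name `iterate_map`, the use of a degree sample to stand in for $p(k)$, and the initial condition are illustrative assumptions, not part of the original analysis.

```python
import numpy as np


def iterate_map(degrees, N, T, Phi, alpha=2.0, mu_init=0.5, steps=200):
    """Iterate the assumed two-dimensional map for (mu_1, mu_{alpha+1}).

    degrees : sample of node degrees standing in for p(k) (hypothetical input)
    N       : number of nodes, entering through the factor k / (T N)
    Returns the trajectory of mu_1 as a 1-D array of length steps + 1.
    """
    k = np.asarray(degrees, dtype=float)
    mean_k = k.mean()
    mean_ka = np.mean(k ** (alpha + 1))
    ratio = mean_ka / mean_k ** (alpha + 1)      # heterogeneity ratio R
    mu1 = mua = mu_init
    trajectory = [mu1]
    for _ in range(steps):
        # Argument of the tanh, evaluated for every degree in the sample
        M = k / (T * N) * (mu1 + (Phi - 1.0) * ratio * abs(mu1) ** alpha * mua)
        th = np.tanh(M)
        # Both components are updated from the same (old) values
        mu1 = np.mean(k * th) / mean_k
        mua = np.mean(k ** (alpha + 1) * th) / mean_ka
        trajectory.append(mu1)
    return np.array(trajectory)


if __name__ == "__main__":
    N = 1600
    degrees = np.full(N, 20)                     # regular network, k = 20
    Tc = np.mean(degrees ** 2) / (np.mean(degrees) * N)
    for T in (0.5 * Tc, 1.5 * Tc):
        mu = iterate_map(degrees, N, T, Phi=1.0)
        print(f"T/Tc = {T / Tc:.1f}: final mu_1 = {mu[-1]:.4f}")
```

With no fatigue ($\Phi = 1$) and a regular network the sketch reduces to the familiar mean-field equation $\mu = \tanh(\mu T_c/T)$, which gives a simple sanity check on either side of the critical temperature.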



A main consequence of this is the existence of a critical temperature, $T_c$,
under very general conditions. More specifically, as $T$ is decreased, the
overlap $m$ describes a second-order phase transition from a disordered or,
say, "paramagnetic" phase to an ordered ("ferromagnetic") phase which exhibits
associative memory. The mean-field temperature at which this transition occurs
is

$$T_c = \frac{\langle k^2\rangle}{\langle k\rangle N}.$$

On the other hand, the map reduces to

$$\mu_n(t+1) = \mathrm{sign}\left\{\mu_n(t)\left[1 + (\Phi - 1)\frac{\langle k^{\alpha+1}\rangle}{\langle k\rangle^{\alpha+1}}\right]\right\}$$

for $T \to 0$. This implies the existence at $\Phi = \Phi_0$, where

$$\Phi_0 = 1 - \frac{\langle k\rangle^{\alpha+1}}{\langle k^{\alpha+1}\rangle},$$



of a transition as $\Phi$ is decreased from the ferromagnetic phase to a new
phase in which periodic hopping between the attractor and its negative occurs.
This is confirmed by the Monte Carlo simulations for $M > 1$; that is, the
hopping is also among different attractors for finite $T$. The simulations also
indicate that this transition washes out at low enough finite temperature.
Instead, Monte Carlo evolutions show that, for a certain range of $\Phi$
values, the system activity then exhibits chaotic behaviour.

The transition from ferromagnetic to chaotic states is a main concern
hereafter. Our interest in this regime follows from several recent observations
concerning the relevance of chaotic activity in a network. In particular, it has
been shown that chaos might be responsible for certain states of attention
during brain activity (Torres et al., 2008, 2009), and that some network
properties such as the computational capacity (Bertschinger and Natschläger,
2004) and the dynamic range of sensitivity to stimuli (de Assis and Copelli,
2008) may become optimal at the Edge of Chaos in a variety of settings.

We next note that the critical values $T_c$ and $\Phi_0$ only depend on the
moments of the generic distribution $p(k)$, and that the ratio
$\langle k^a\rangle/\langle k\rangle^a$, $a > 1$, is a convenient way of
characterizing heterogeneity. We studied in detail two particular types of
connectivity distributions with easily tunable heterogeneity; that is, networks
with $\langle k\rangle N/2$ edges randomly distributed with $p(k)$ such that the
heterogeneity depends on a single parameter. Our first case is the bimodal
distribution, $p(k) = \frac{1}{2}\delta(k - k_1) + \frac{1}{2}\delta(k - k_2)$
with parameter $\Delta = (k_2 - k_1)/2 = \langle k\rangle - k_1 = k_2 - \langle k\rangle$.
Our second case is the scale-free distribution, $p(k) \sim k^{-\gamma}$, which
does not have any characteristic size but $k$ is confined to the limits $k_0$
and $k_m \leq \min(k_0 N^{1/(\gamma - 1)}, N - 1)$ for finite $N$. (The
distribution is truncated and therefore not strictly scale free for
$\gamma < 2$. However, nature shows examples for which $\gamma$ is slightly
larger than 1, so we consider the whole range here.) Notice that the network in
this case gets more homogeneous as $\gamma$ is increased and that this kind of
distribution seems to be most relevant in nature (Newman, 2003c; Boccaletti et
al., 2006). In particular, it seems important to mention that the functional
topology of the human brain, as defined by correlated activity between small
clusters of neurons, has been shown to correspond to this case with exponent
$\gamma \simeq 2$ (Eguíluz et al., 2005). (It has not yet been possible to
ascertain the brain's structural topology experimentally, but there is some
evidence that function reflects structure at least to some extent (Zhou et al.,
2006b). Furthermore, it has been suggested, based on indirect methods, that the
structural connectivity of cat and macaque brains, at the level of brain areas,
may indeed be scale free (Kaiser et al., 2007) - and in any case displays
significantly higher heterogeneity than that of, say, Erdős-Rényi random
graphs.)

Figure 4.1: The temperature dependence of the difference between the values
for the fatigue at which the ferromagnetic-periodic transition occurs, as
obtained analytically for $T = 0$ ($\Phi_0$) and from MC simulations at finite
$T$ ($\Phi_c$). The critical temperature is calculated as
$T_c = \langle k^2\rangle(\langle k\rangle N)^{-1}$ for each topology. Data are
for bimodal distributions with varying $\Delta$ and for scale-free topologies
with varying $\gamma$, as indicated. Here, $\langle k\rangle = 20$, $N = 1600$
and $\alpha = 2$. Standard deviations, represented as bars in this graph, were
shown to drop with $N^{-1/2}$ (not depicted).

Figure 4.2: The critical fatigue values $\Phi_0$ (solid lines) and $\Phi_c$
from MC averages over 10 networks (symbols) with $T = 2/N$,
$\langle k\rangle = 20$, $N = 1600$, $\alpha = 2$. The dots below the lines
correspond to changes of sign of the Lyapunov exponent as given by the iterated
map, which qualitatively agree with the other results. This is for bimodal and
scale-free topologies, as indicated.

We obtained the critical value of the fatigue, $\Phi_c(T)$, from Monte Carlo
simulations at finite temperature $T$. These indicate that chaos never occurs
for $T > 0.35\,T_c$. On the other hand, a detailed comparison of the value
$\Phi_c$ with $\Phi_0$ - as obtained analytically for $T = 0$ - indicates that
$\Phi_c \simeq \Phi_0$.

Figure 4.1 illustrates the "error" $\Phi_0 - \Phi_c(T)$ for different
topologies. This shows that the approximation $\Phi_c \simeq \Phi_0$ is quite
good at low $T$ for any of the cases examined. Therefore, assuming the critical
values for the main parameters, $T_c$ and $\Phi_0$, as given by our map, we
conclude that the more heterogeneous the distribution of connectivities of a
network is, the lower the amount of fatigue, and the higher the critical
temperature, needed to destabilize the dynamics. As an example of this
interesting behaviour, consider a network with $\langle k\rangle = \ln(N)$, and
dynamics according to $\alpha = 2$. If the distribution were regular, the
critical values would be $T_c = \ln(N)/N$ (which goes to zero in the
thermodynamic limit) and $\Phi_0 = 0$. However, a scale-free topology with the
same number of edges and $\gamma = 2$ would yield $T_c = 1$ and
$\Phi_0 = 1 - 2(\ln N)^3/N^2$ (which goes to 1 as $N \to \infty$).
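The dependence of the critical values on the degree moments can be explored numerically. The following Python sketch uses $T_c = \langle k^2\rangle/(\langle k\rangle N)$ and assumes $\Phi_0 = 1 - \langle k\rangle^{\alpha+1}/\langle k^{\alpha+1}\rangle$, a form chosen because it gives $\Phi_0 = 0$ for a regular network. The helper names (`critical_values`, `bimodal`, `truncated_power_law`) and the cut-offs in the example are hypothetical, and the scale-free case here does not fix $\langle k\rangle$ as the text does.

```python
import numpy as np


def critical_values(k_values, p_k, N, alpha=2.0):
    """Return (T_c, Phi_0) from the moments of a degree distribution.

    T_c   = <k^2> / (<k> N)
    Phi_0 = 1 - <k>^(alpha+1) / <k^(alpha+1)>   (assumed form)
    """
    k = np.asarray(k_values, dtype=float)
    p = np.asarray(p_k, dtype=float)
    p = p / p.sum()                              # normalise the distribution
    mean_k = np.sum(p * k)
    Tc = np.sum(p * k ** 2) / (mean_k * N)
    Phi0 = 1.0 - mean_k ** (alpha + 1) / np.sum(p * k ** (alpha + 1))
    return Tc, Phi0


def bimodal(mean_k, Delta):
    """Degrees and probabilities of p(k) = 1/2 d(k - k1) + 1/2 d(k - k2)."""
    return [mean_k - Delta, mean_k + Delta], [0.5, 0.5]


def truncated_power_law(gamma, k0, km):
    """Degrees and unnormalised weights of p(k) ~ k^(-gamma) on k0..km."""
    k = np.arange(k0, km + 1)
    return k, k.astype(float) ** (-gamma)


if __name__ == "__main__":
    N = 1600
    print("regular      :", critical_values(*bimodal(20, 0), N))
    print("bimodal D=10 :", critical_values(*bimodal(20, 10), N))
    for gamma in (2.0, 3.0, 4.0):
        print(f"power law g={gamma}:",
              critical_values(*truncated_power_law(gamma, 3, 400), N))
```

Under these assumptions, more heterogeneous distributions (larger $\Delta$, smaller $\gamma$) give larger $\Phi_0$, in line with the qualitative statement that less fatigue is then needed to destabilize the dynamics.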

Figure 4.2 illustrates, for two topologies, the phase diagram of the
ferromagnetic-chaotic transition. Most remarkable is the plateau observed in
the Edge-of-Chaos or transition curve for scale-free topologies around
$\gamma \simeq 2$, for which very little fatigue, namely, $\Phi \lesssim 1$
which corresponds to slight depression, is required to achieve chaos. The limit
$\gamma \to \infty$ corresponds to $\langle k\rangle$-regular graphs (equivalent
to $\Delta = 0$). If $\gamma$ is reduced, $k_m$ increases and $k_0$ decreases.
The network is truncated when $k_m = N$. It follows that a value of $\gamma$
exists at which $k_0$ cannot be smaller, so that $k_m$ must drop to preserve
$\langle k\rangle$. This explains the fall in $\Phi_c$ as $\gamma \to 1$.

Assuming that the "ferromagnetic phase" here corresponds to a synchronous
state, our results are in qualitative agreement with the ones obtained recently
for coupled oscillators (Nishikawa et al., 2003; Zhou et al., 2006a). As a
matter of fact, the range of coupling strengths which allow for stability of
synchronous states in these systems has been shown to depend on the spectral
gap of the Laplacian matrix (Barahona and Pecora, 2002), implying that the more
heterogeneous a topology is, the more easily activity can become unstable. It
should be emphasized, however, that the dynamics we are considering here 
does not come within the scope of the formalism used to derive these results, 
since activity at node i depends on the local field at node j. 



4.4 Network performance 

As a further illustration of our findings, we monitored the performance as a
function of topology during a simulation of pattern recognition. That is, we
"showed" the system a pattern, say $\nu$, chosen at random from the set of $M$
previously stored, every certain number of time steps. This was performed in
practice by changing the field at each node for one time step, namely,
$h_i \to h_i + \delta\xi_i^{\nu}$, where $\delta$ measures the intensity of the
input signal. Ideally, the network should remain in this configuration until it
is newly stimulated. The performance may thus be estimated from a temporal
average of the overlap between the current state and the input pattern,
$\langle m^{\nu}\rangle_{\rm time}$. This is observed to simply increase
monotonically with $\Delta$ for the bimodal case. The scale-free case, however,
as illustrated in Fig. 4.3, shows how the task is better performed the closer
to the Edge of Chaos the network is. This is because the system is then easily
destabilized by the stimulus while being able to retrieve a pattern with
accuracy. Figure 4.3 also shows that the best performance for the scale-free
topology when $\Phi = 1$, i.e., in the absence of any fatigue, definitely
occurs around $\gamma = 2$.

Figure 4.3: Network "performance" (see the main text) against $\Delta$ for
bimodal topologies (above) and against $\gamma$ for scale-free topologies
(below). $\Phi = 0.8$ for the first case and $\Phi = 1$ in the second. Averages
over 20 network realizations with stimulation every 50 MC steps for 2000 MC
steps, $\delta = 5$ and $M = 4$; other parameters as in Fig. 6.5. Inset shows
sections of typical time series of $m^{\nu}$ for $\Delta = 10$ (above) and
$\gamma = 4$ (below); the corresponding stimulus for pattern $\nu$ is shown
underneath.



4.5 Discussion 



The model network we have studied is one of the simplest relevant situations
one may conceive. In particular, as emphasized above, we are greatly
simplifying the elements at the nodes (neurons) as binary variables. However,
our assumption of dynamic connections which depend on the local fields in such
a simple scenario happens to show that a close relation may exist between
topological heterogeneity and function, thus suggesting this may indeed be a
relevant property for a realistic network efficiently to perform certain high
level tasks. In a similar way to networks shown previously to be useful for
pattern recognition and family identification (Cortes et al., 2005), our system
retrieves memory patterns with accuracy in spite of noise, and yet it is easily
destabilized so as to change state in response to an input signal - without
requiring excessive fatigue for the purpose. There is a relation between the
amount $\Phi$ of fatigue and the value of $\gamma$ for which performance is
maximized. One may argue that the plateau of "good" behaviour shown around
$\gamma \simeq 2$ for scale-free networks with $\Phi \lesssim 1$ (Fig. 6.5) is a
possible justification for the supposed tendency of certain systems in nature
to evolve towards this topology. It may also prove useful for implementing some
artificial networks.



Chapter 5 

Correlated networks and 
natural disassortativity 



An intriguing feature of complex networks is the ubiquity of strong negative
degree-degree correlations between neighbouring nodes - the only exceptions
being social systems, which tend to be assortative instead of disassortative.
With the double purpose of addressing this mystery and uncovering the effects
of correlations on network behaviour, we put forward a method which allows
for the model-independent study of ensembles of correlated networks. We go
on to show, by means of an information theory approach, that the expected
value of correlations for a network at equilibrium (i.e., in the absence of
specific correlating mechanisms) is not, as had been supposed, uncorrelated,
but rather disassortative. It turns out that the correlations of some networks
are in excellent agreement with our predictions, while others, with known
correlating or anticorrelating mechanisms, indeed appear to have been driven
from their equilibrium points as expected. Therefore, our approach not only
provides a parsimonious topological answer to a long-standing question, but
also a neutral model against which to contrast experimental data to determine
whether mechanisms must be sought to account for observed correlations. We go
on to use our method, in Chapter 6, to study the influence of assortativity on
neural-network dynamics.



5.1 Assortativity of networks 

Complex networks, whether natural or artificial, have non-trivial topologies
which are usually studied by analysing a variety of measures, such as the
degree distribution, clustering, average paths, modularity, etc. (Albert and
Barabási, 2002; Dorogovtsev and Mendes, 2003; Pastor-Satorras and Vespignani;
Newman, 2003c; Boccaletti et al., 2006). The mechanisms which lead to a
particular structure and their relation to functional constraints are often not
clear and constitute the subject of much debate (Newman, 2003c; Boccaletti et
al., 2006). When nodes are endowed with some additional "property," a feature
known as mixing or assortativity can arise, whereby edges are not placed
between nodes completely at random, but depending in some way on the property
in question. If similar (dissimilar) nodes tend to wire together, the network
is said to be assortative (disassortative) (Newman, 2002, 2003a).



An interesting situation is when the property taken into account is the 
degree of each node - i.e., the number of neighbouring nodes connected to it. 
It turns out that a high proportion of empirical networks - whether biological, 
technological, information-related or linguistic - are disassortatively arranged 
(high-degree nodes, or hubs, are preferentially linked to low-degree neighbours, 
and vice versa) while social networks are usually assortative. Such
degree-degree correlations have important consequences for network
characteristics such as connectedness and robustness (Newman, 2002, 2003a).

However, while assortativity in social networks can be explained taking into
account homophily (Newman, 2002, 2003a) or modularity (Newman and Park, 2003),
the widespread prevalence and extent of disassortative mixing in most other
networks remains somewhat mysterious. Maslov et al. found that the restriction
of having at most one edge per pair of nodes induces some disassortative
correlations in heterogeneous networks (Maslov et al., 2004), and Park and
Newman showed how this analogue of the Pauli exclusion principle leads to the
edges following Fermi statistics (Park and Newman, 2003) (see also (Capocci and
Colaiori, 2006)). However, this restriction is not sufficient to fully account
for empirical data. In general, when one attempts to consider computationally
all the networks with the same distribution as a given empirical one, the mean
assortativity is not necessarily zero (Holme and Zhao, 2007). But since some
"randomization" mechanisms induce positive correlations and others negative
ones (Farkas et al., 2004; Johnson et al., 2010a), it is not clear how the
phase space can be properly sampled numerically.

In this chapter we develop a method for the study of correlated networks which
is model-independent, and describe the main result of Ref. (Johnson et al.,
2010b) - namely, that there is a general reason, consistent with empirical
data, for the "natural" mixing of most networks to be disassortative. Using an
information-theory approach we find that the configuration which can be
expected to come about in the absence of specific additional constraints
turns out not to be, in general, uncorrelated. In fact, for highly heterogeneous 
degree distributions such as those of the ubiquitous scale-free networks, we 
show that the expected value of the mixing is usually disassortative: there are 
simply more possible disassortative configurations than assortative ones. This 
result provides a simple topological answer to a long-standing question. Let us 
caution that this does not imply that all scale-free networks are disassortative, 
but only that, in the absence of further information on the mechanisms behind 
their evolution, this is the neutral expectation. 



5.2 The entropy of network ensembles 

The topology of a network is entirely described by its adjacency matrix
$\hat{a}$; the element $\hat{a}_{ij}$ represents the number of edges linking
node $i$ to node $j$ (for undirected networks, $\hat{a}$ is symmetric). Among
all the possible microscopically distinguishable configurations a set of $L$
edges can adopt when distributed among
N nodes, it is often convenient to consider the set of configurations which have 
certain features in common - typically some macroscopic magnitude, like the 
degree distribution. Such a set of configurations defines an ensemble. In a 
seminal series of papers Bianconi has determined the partition functions of
various ensembles of random networks and derived their statistical-mechanics
entropy (Bianconi, 2008, 2009; Anand and Bianconi, 2009). This allows the
author to estimate the probability that a random network with certain
constraints has of belonging to a particular ensemble, and thus assess the
relative importance of different magnitudes and help discern the mechanisms
responsible for a given real-world network. For instance, she shows that
scale-free networks arise naturally when the total entropy is restricted to a
small finite value. Here we take a similar approach: we obtain the Shannon
information entropy encoded in the distribution of edges. As we shall see, both
methods yield the same results (Jaynes, 1957; Anand and Bianconi, 2009), but
for our purposes the Shannon entropy is more tractable.

The Shannon entropy associated with a probability distribution $p_m$ is

$$s = -\sum_m p_m \ln(p_m),$$

where the sum extends over all possible outcomes $m$. For a given pair of
nodes $(i, j)$, $p_m$ can be considered to represent the probability of there
being $m$ edges between $i$ and $j$. For simplicity, we shall focus here on
networks such that $\hat{a}_{ij}$ can only take values 0 or 1, although the
method is applicable to any number of edges allowed. In this case, we have only
two terms: $p_1 = \hat{\epsilon}_{ij}$ and $p_0 = 1 - \hat{\epsilon}_{ij}$,
where $\hat{\epsilon}_{ij} = E(\hat{a}_{ij})$ is the expected value of the
element $\hat{a}_{ij}$ given that the network belongs to the ensemble of
interest. The entropy associated with pair $(i, j)$ is then

$$s_{ij} = -\left[\hat{\epsilon}_{ij}\ln(\hat{\epsilon}_{ij}) + (1 - \hat{\epsilon}_{ij})\ln(1 - \hat{\epsilon}_{ij})\right],$$

while the total entropy of the network is $S = \sum_{ij}^{N} s_{ij}$:

$$S = -\sum_{ij}^{N}\left[\hat{\epsilon}_{ij}\ln(\hat{\epsilon}_{ij}) + (1 - \hat{\epsilon}_{ij})\ln(1 - \hat{\epsilon}_{ij})\right]. \tag{5.1}$$



Since we have not imposed symmetry of the adjacency matrix, this expression 
is in general valid for directed networks. For undirected networks, however, 
the sum is only over i < j, with the consequent reduction in entropy. 

For the sake of illustration, we shall estimate the entropy of the Internet
at the autonomous system (AS) level and compare it with the values obtained
in (Bianconi, 2008, 2009; Anand and Bianconi, 2009) assuming the network
belongs to two different ensembles: the fully random graph, or Erdős-Rényi
(ER) ensemble, and the configuration ensemble with a scale-free degree
distribution ($p(k) \sim k^{-\gamma}$) (Newman, 2003c) and structural cutoff,
$k_i < \sqrt{\langle k\rangle N}$, $\forall i$ (Bianconi, 2008, 2009; Anand and
Bianconi, 2009) ($\langle k\rangle$ is the mean degree).

In this example, we assume the network to be sparse enough to expand the
term $\ln(1 - \hat{\epsilon}_{ij})$ in Eq. (5.1) and keep only linear terms.
This reduces Eq. (5.1) to

$$S_{\rm sparse} \simeq -\sum_{ij}^{N}\hat{\epsilon}_{ij}\left[\ln(\hat{\epsilon}_{ij}) - 1\right] + O(\hat{\epsilon}_{ij}^2).$$

In the ER ensemble, each of $N$ nodes has an equal probability of receiving
each of $\frac{1}{2}\langle k\rangle N$ undirected edges. So, writing
$\hat{\epsilon}_{ij}^{ER} = \langle k\rangle/N$, we have

$$S_{ER} = -\frac{1}{2}\langle k\rangle N\left[\ln(\langle k\rangle/N) - 1\right].$$
Figure 5.1: Evolution of the Internet at the AS level. Empty (blue) squares
and circles: entropy per node of randomized networks in the fully random and
in the configuration ensembles, as obtained by Bianconi (hence the "B"
superscript) (Bianconi, 2008, 2009; Anand and Bianconi, 2009). Filled (red)
triangles and diamonds: Shannon entropy for an ER network and a scale-free
one with $\gamma = 2.3$, respectively.

The configuration ensemble, which imposes a given degree sequence
$(k_1, ... k_N)$, is defined via the expected value of the adjacency matrix
(Newman, 2003c; Johnson et al., 2008):

$$\hat{\epsilon}_{ij}^{c} = k_i k_j/(\langle k\rangle N).$$

This value leads to

$$S_c = \langle k\rangle N[\ln(\langle k\rangle N) + 1] - 2N\langle k\ln k\rangle,$$

where $\langle\cdot\rangle = N^{-1}\sum_i(\cdot)$ stands for an average over
nodes.
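The sparse-network expressions can be checked numerically against the full entropy of Eq. (5.1). The Python sketch below is only illustrative: the function names and the randomly drawn degree sequence are hypothetical, and each closed form is compared under the pair-counting convention that reproduces it (pairs $i < j$ for the ER expression, all ordered pairs for the configuration-ensemble expression).

```python
import numpy as np


def shannon_entropy(eps, undirected=False):
    """Entropy of Eq. (5.1) for an expected adjacency matrix eps.

    Sums -[e ln e + (1 - e) ln(1 - e)] over all ordered pairs (i, j), or only
    over i < j when undirected=True. Entries equal to 0 or 1 contribute 0.
    """
    e = np.asarray(eps, dtype=float)
    if undirected:
        e = e[np.triu_indices_from(e, k=1)]
    e = e[(e > 0) & (e < 1)]
    return -np.sum(e * np.log(e) + (1 - e) * np.log(1 - e))


def configuration_eps(degrees):
    """Expected adjacency matrix k_i k_j / (<k> N) of the configuration ensemble."""
    k = np.asarray(degrees, dtype=float)
    return np.outer(k, k) / k.sum()              # k.sum() equals <k> N


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, mean_k = 1000, 10.0

    # ER ensemble: eps_ij = <k>/N, entropy summed over pairs i < j
    S_er_full = shannon_entropy(np.full((N, N), mean_k / N), undirected=True)
    S_er_sparse = -0.5 * mean_k * N * (np.log(mean_k / N) - 1)

    # Configuration ensemble with an arbitrary (hypothetical) degree sequence
    k = rng.integers(5, 31, size=N)
    S_c_full = shannon_entropy(configuration_eps(k))
    S_c_sparse = k.sum() * (np.log(k.sum()) + 1) - 2 * np.sum(k * np.log(k))

    print("ER   :", S_er_full, S_er_sparse)
    print("conf.:", S_c_full, S_c_sparse)
```

For sparse matrices the full and approximate values should differ only by terms of order $\hat{\epsilon}_{ij}^2$, so the agreement degrades as the degrees approach the structural cutoff.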



Fig. 5.1 displays the entropy per node obtained in (Bianconi, 2008, 2009;
Anand and Bianconi, 2009) for the first two levels of approximation (ensembles)
to the Internet at the AS level, first taking into account only the numbers
of nodes and edges, $L = \frac{1}{2}\langle k\rangle N$, and then also the
degree sequence. Alongside these, we plot the Shannon entropy both for an ER
random network (which coincides exactly with Bianconi's expression), and for a
scale-free network with $\gamma = 2.3$ (the slight disparity arising from this
exponent's changing a little with time).



5.3 Entropic origin of disassortativity 

Figure 5.2: Shannon entropy of correlated scale-free networks against
parameter $\beta$ (left panel) and against Pearson's coefficient $r$ (right
panel), for various values of $\gamma$ (increasing from bottom to top),
$\langle k\rangle = 10$, $N = 10^{\ldots}$.

We shall now go on to analyse the effect of degree-degree correlations on the
entropy. In the configuration ensemble, the expected value of the mean degree
of the neighbours of a given node is

$$k_{nn,i} = k_i^{-1}\sum_j \hat{\epsilon}_{ij}^{c} k_j = \langle k^2\rangle/\langle k\rangle,$$

which is independent of $k_i$. However, as mentioned above, real networks often
display degree-degree correlations, with the result that
$k_{nn,i} = k_{nn}(k_i)$. If $k_{nn}(k)$ increases (decreases) with $k$, the
network is assortative (disassortative). This can be quantified by Pearson's
coefficient applied to the edges (Newman, 2002, 2003a, 2003c; Boccaletti et
al., 2006):

$$r = \frac{[k_l k'_l] - [k_l]^2}{[k_l^2] - [k_l]^2},$$

where $k_l$ and $k'_l$ are the degrees of each of the two nodes belonging to
edge $l$, and $[\cdot] = (\langle k\rangle N)^{-1}\sum_l(\cdot)$ is an average
over edges. Writing $\sum_l(\cdot) = \sum_{ij}\hat{a}_{ij}(\cdot)$, $r$ can be
expressed as

$$r = \frac{\langle k\rangle\langle k^2 k_{nn}(k)\rangle - \langle k^2\rangle^2}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}. \tag{5.2}$$
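To make the equivalence between the edge-average definition of $r$ and the node-average form of Eq. (5.2) concrete, the following Python sketch evaluates both on a given adjacency matrix. The function names and the star-graph example are illustrative choices rather than anything taken from the original work.

```python
import numpy as np


def pearson_r_edges(A):
    """r = ([k k'] - [k]^2) / ([k^2] - [k]^2), averaging over edge ends."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    i, j = np.nonzero(A)                 # each undirected edge appears twice
    w = A[i, j]                          # edge multiplicities used as weights

    def edge_avg(x):
        return np.sum(w * x) / np.sum(w)

    mean_end = edge_avg(k[i])
    return (edge_avg(k[i] * k[j]) - mean_end ** 2) / (edge_avg(k[i] ** 2) - mean_end ** 2)


def pearson_r_moments(A):
    """The same coefficient written with node averages, as in Eq. (5.2)."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    connected = k > 0                    # isolated nodes have no neighbours
    knn = np.zeros_like(k)
    knn[connected] = (A @ k)[connected] / k[connected]
    m1, m2, m3 = k.mean(), np.mean(k ** 2), np.mean(k ** 3)
    return (m1 * np.mean(k ** 2 * knn) - m2 ** 2) / (m1 * m3 - m2 ** 2)


def star_graph(n_leaves):
    """Adjacency matrix of a hub (node 0) linked to n_leaves leaves."""
    A = np.zeros((n_leaves + 1, n_leaves + 1))
    A[0, 1:] = 1
    A[1:, 0] = 1
    return A


if __name__ == "__main__":
    A = star_graph(5)
    print(pearson_r_edges(A), pearson_r_moments(A))
```

A star graph is a convenient test case because every edge joins the highest-degree node to a degree-one node, which is the extreme of disassortative mixing.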



The ensemble of all networks with a given degree sequence $(k_1, ... k_N)$
contains a subset for all members of which $k_{nn}(k)$ is constant (the
configuration ensemble), but also subsets displaying other functions
$k_{nn}(k)$. We can identify each one of these subsets (regions of phase space)
with an expected adjacency matrix $\hat{\epsilon}$ which simultaneously
satisfies the following conditions:

i) $\sum_j k_j \hat{\epsilon}_{ij} = k_i k_{nn}(k_i)$, $\forall i$, and

ii) $\sum_j \hat{\epsilon}_{ij} = k_i$, $\forall i$ (for consistency).

An ansatz which fulfils these requirements is any matrix of the form

$$\hat{\epsilon}_{ij} = \frac{k_i k_j}{\langle k\rangle N} + \int d\nu\, \frac{f(\nu)}{N}\left[\frac{(k_i k_j)^{\nu}}{\langle k^{\nu}\rangle} - k_i^{\nu} - k_j^{\nu} + \langle k^{\nu}\rangle\right], \tag{5.3}$$

where $\nu \in \mathbb{R}$ and the function $f(\nu)$ is in general arbitrary,
although depending on the degree sequence it shall here be restricted to values
which maintain $\hat{\epsilon}_{ij} \in [0, 1]$, $\forall i, j$. This ansatz
yields

$$k_{nn}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + \int d\nu\, f(\nu)\,\sigma_{\nu+1}\left[\frac{k^{\nu-1}}{\langle k^{\nu}\rangle} - \frac{1}{k}\right] \tag{5.4}$$

(the first term being the result for the configuration ensemble), where
$\sigma_{b+1} = \langle k^{b+1}\rangle - \langle k\rangle\langle k^{b}\rangle$.
In practice, one could adjust Eq. (5.4) to fit any given function $k_{nn}(k)$
and then wire up a network with the desired correlations: it suffices to throw
random numbers according to Eq. (5.3) with $f(\nu)$ as obtained from the fit to
Eq. (5.4) (although, as with the configuration ensemble, it is not always
possible to wire a network according to a given $\hat{\epsilon}$). To prove the
uniqueness of a matrix $\hat{\epsilon}$ obtained in this way (i.e., that it is
the only one compatible with a given $k_{nn}(k)$) assume that there exists
another valid matrix $\hat{\epsilon}' \neq \hat{\epsilon}$. Writing
$\hat{\epsilon}'_{ij} - \hat{\epsilon}_{ij} = h(k_i, k_j) = h_{ij}$, then i)
implies that $\sum_j k_j h_{ij} = 0$, $\forall i$, while ii) means that
$\sum_j h_{ij} = 0$, $\forall i$. It follows that $h_{ij} = 0$, $\forall i, j$.

In many empirical networks, $k_{nn}(k)$ has the form $k_{nn}(k) = A + Bk^{\beta}$,
with $A, B > 0$ (Boccaletti et al., 2006; Pastor-Satorras et al., 2001) - the
mixing being assortative (disassortative) if $\beta$ is positive (negative).
Such a case is fitted by Eq. (5.4) if

$$f(\nu) = C\left[\frac{\sigma_2}{\sigma_{\beta+2}}\,\delta(\nu - \beta - 1) - \delta(\nu - 1)\right],$$

with $C$ a positive constant, since this choice yields

$$k_{nn}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + C\sigma_2\left[\frac{k^{\beta}}{\langle k^{\beta+1}\rangle} - \frac{1}{\langle k\rangle}\right]. \tag{5.5}$$

After plugging Eq. (5.5) into Eq. (5.2), one obtains:

$$r = \frac{C\sigma_2}{\langle k^{\beta+1}\rangle}\left(\frac{\langle k\rangle\langle k^{\beta+2}\rangle - \langle k^2\rangle\langle k^{\beta+1}\rangle}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}\right). \tag{5.6}$$


Inserting Eq. (5.3) in Eq. (5.1), we can calculate the entropy of correlated
networks as a function of $\beta$ and $C$ - or, by using Eq. (5.6), as a
function of $r$. Particularizing for scale-free networks, then given
$\langle k\rangle$, $N$ and $\gamma$, there is always a certain combination of
parameters $\beta$ and $C$ which maximizes the entropy; we shall call these
$\beta^*$ and $C^*$. For $\gamma \lesssim 5/2$ this point corresponds to
$C^* = 1$. For higher $\gamma$, the entropy can be slightly higher for larger
$C$. However, for these values of $\gamma$, the assortativity $r$ of the point
of maximum entropy obtained with $C = 1$ differs very little from the one
corresponding to $\beta^*$ and $C^*$ (data not shown). Therefore, for the sake
of clarity but with very little loss of accuracy, in the following we shall
generically set $C = 1$ and vary only $\beta$ in our search for the level of
assortativity, $r^*$, that maximizes the entropy given $\langle k\rangle$, $N$
and $\gamma$. Note that $C = 1$ corresponds to removing the linear term,
proportional to $k_i k_j$, in Eq. (5.3), and leaving the leading non-linearity,
$(k_i k_j)^{\beta+1}$, as the dominant one.
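The algebra connecting the ansatz to $k_{nn}(k)$ and $r$ can be verified numerically. The Python sketch below assumes the ansatz $\hat{\epsilon}_{ij} = k_ik_j/(\langle k\rangle N) + (C/N)\{(\sigma_2/\sigma_{\beta+2})B_{\beta+1} - B_1\}$, with $B_\nu = (k_ik_j)^\nu/\langle k^\nu\rangle - k_i^\nu - k_j^\nu + \langle k^\nu\rangle$ and $\sigma_b = \langle k^b\rangle - \langle k\rangle\langle k^{b-1}\rangle$. All function names are hypothetical, and the sketch does not check that the entries stay within $[0, 1]$; $\beta = -1$ must be avoided because $\sigma_{\beta+2}$ then vanishes.

```python
import numpy as np


def moment(k, a):
    """Node average <k^a>."""
    return np.mean(k ** a)


def sigma(k, b):
    """sigma_b = <k^b> - <k><k^(b-1)>."""
    return moment(k, b) - moment(k, 1) * moment(k, b - 1)


def correlated_eps(degrees, beta, C=1.0):
    """Expected adjacency matrix from the assumed ansatz with delta-function f."""
    k = np.asarray(degrees, dtype=float)
    N = k.size

    def bracket(nu):
        kn = k ** nu
        return (np.outer(kn, kn) / moment(k, nu)
                - kn[:, None] - kn[None, :] + moment(k, nu))

    eps = np.outer(k, k) / (k.mean() * N)
    eps += C / N * (sigma(k, 2) / sigma(k, beta + 2) * bracket(beta + 1)
                    - bracket(1.0))
    return eps


def knn_theory(degrees, beta, C=1.0):
    """k_nn(k_i) = <k^2>/<k> + C sigma_2 [k^beta/<k^(beta+1)> - 1/<k>]."""
    k = np.asarray(degrees, dtype=float)
    return (moment(k, 2) / k.mean()
            + C * sigma(k, 2) * (k ** beta / moment(k, beta + 1) - 1 / k.mean()))


def r_theory(degrees, beta, C=1.0):
    """Pearson coefficient implied by the k_nn(k) above."""
    k = np.asarray(degrees, dtype=float)
    num = k.mean() * moment(k, beta + 2) - moment(k, 2) * moment(k, beta + 1)
    den = k.mean() * moment(k, 3) - moment(k, 2) ** 2
    return C * sigma(k, 2) / moment(k, beta + 1) * num / den


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = rng.integers(3, 30, size=200).astype(float)
    for beta in (-0.5, 0.3):
        eps = correlated_eps(k, beta)
        print(beta,
              np.allclose(eps.sum(axis=1), k),               # condition ii)
              np.allclose(eps @ k / k, knn_theory(k, beta)), # condition i)
              r_theory(k, beta))
```

Setting $\beta = 0$ in this sketch recovers the uncorrelated configuration-ensemble matrix, which is a quick way to confirm that the two delta-function terms cancel as intended.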



Fig. 5.2 displays the entropy curves for various scale-free networks, both
as functions of $\beta$ and of $r$: depending on the value of $\gamma$, the
point of maximum entropy can be either assortative or disassortative. This can
be seen more clearly in Fig. 5.3, where $r^*$ is plotted against $\gamma$ for
scale-free networks with various mean degrees $\langle k\rangle$. The values
obtained by Park and Newman (Park and Newman, 2003) as those resulting from the
one-edge-per-pair restriction are also shown for comparison: notice that
whereas this effect alone cannot account for the Internet's correlations for
any $\gamma$, entropy considerations would suffice if $\gamma \simeq 2.1$. As
shown in the inset, the results are robust in the large system-size limit.
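A search for the entropy-maximizing assortativity can be sketched as a simple scan over $\beta$ at $C = 1$. The Python code below is a minimal illustration under several assumptions: the degree sequence is sampled from a truncated power law with hypothetical cut-offs, the entropy is summed over pairs $i < j$, values of $\beta$ for which the expected adjacency matrix leaves $[0, 1]$ are discarded, and the grid must not contain $\beta = -1$. It is not the numerical procedure used to produce the figures, and no particular output is claimed.

```python
import numpy as np


def scale_free_degrees(N, gamma, k0, km, rng):
    """Sample N integer degrees from p(k) ~ k^(-gamma) on k0..km (illustrative)."""
    ks = np.arange(k0, km + 1)
    p = ks.astype(float) ** (-gamma)
    return rng.choice(ks, size=N, p=p / p.sum()).astype(float)


def eps_C1(k, beta):
    """Assumed ansatz for the expected adjacency matrix with C = 1."""
    N = k.size

    def m(a):
        return np.mean(k ** a)

    def bracket(nu):
        kn = k ** nu
        return np.outer(kn, kn) / m(nu) - kn[:, None] - kn[None, :] + m(nu)

    s2 = m(2) - m(1) ** 2
    sb = m(beta + 2) - m(1) * m(beta + 1)
    return (np.outer(k, k) / (m(1) * N)
            + (s2 / sb * bracket(beta + 1) - bracket(1.0)) / N)


def entropy(eps):
    """Entropy summed over i < j; nan if eps is not a valid probability matrix."""
    e = eps[np.triu_indices_from(eps, k=1)]
    if not np.all(np.isfinite(e)) or e.min() < 0 or e.max() > 1:
        return np.nan
    e = e[(e > 0) & (e < 1)]
    return -np.sum(e * np.log(e) + (1 - e) * np.log(1 - e))


def pearson_r(k, beta):
    """Pearson coefficient for the power-law form of k_nn(k) with C = 1."""
    def m(a):
        return np.mean(k ** a)

    s2 = m(2) - m(1) ** 2
    return (s2 / m(beta + 1) * (m(1) * m(beta + 2) - m(2) * m(beta + 1))
            / (m(1) * m(3) - m(2) ** 2))


def max_entropy_r(k, betas):
    """Scan beta and return (beta*, r*, S*) at the maximum over valid values."""
    S = np.array([entropy(eps_C1(k, b)) for b in betas])
    best = np.nanargmax(S)
    return betas[best], pearson_r(k, betas[best]), S[best]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = scale_free_degrees(400, 2.5, 2, 30, rng)
    beta_star, r_star, S_star = max_entropy_r(k, np.linspace(-0.9, 0.9, 19))
    print(beta_star, r_star, S_star)
```

Because a small sampled network is used here, the location of the maximum will fluctuate between realizations; averaging over many degree sequences, or working directly with degree classes, would be needed for anything quantitative.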

Since most networks observed in the real world are highly heterogeneous,
with exponents in the range $\gamma \in (2, 3)$, it is to be expected that
these should display a certain disassortativity - the more so the lower
$\gamma$ and the higher $\langle k\rangle$. In Fig. 5.4 we test this prediction
on a sample of empirical, scale-free networks quoted in Newman's review
(Newman, 2003c) (p. 182). For each case, we found the value of $r$ that
maximizes $S$ according to Eq. (5.1), after inserting Eq. (5.3) with the quoted
values of $\langle k\rangle$, $N$ and $\gamma$. In this way, we obtained the
expected assortativity for six networks, representing: a peer-to-peer (P2P)
network, metabolic reactions, the nd.edu domain, actor collaborations, protein
interactions, and the Internet (see (Newman, 2003c) and references therein).
For the metabolic, Web domain and protein networks, the values predicted are
in excellent agreement with the measured ones; therefore, no specific
anticorrelating mechanisms need to be invoked to account for their
disassortativity. In the other three cases, however, the predictions are not
accurate, so there must be additional correlating mechanisms at work. Indeed,
it is known that small routers tend to connect to large ones (Pastor-Satorras
et al., 2001), so one would expect the Internet to be more disassortative than
predicted, as is the case - an effect that is less pronounced but still
detectable in the more egalitarian P2P network. (However, as Fig. 5.3 shows, if
the Internet exponent were the $\gamma = 2.2 \pm 0.1$ reported in
(Pastor-Satorras et al., 2001) rather than $\gamma = 2.5$, entropy would
account more fully for these correlations.) Finally, as is typical of social
networks, the actor graph is significantly more assortative than predicted,
probably due to the homophily mechanism whereby highly connected, big-name
actors tend to work together (Newman, 2002, 2003a).

Figure 5.3: Lines from top to bottom: $r$ at which the entropy is maximized,
$r^*$, against $\gamma$ for random scale-free networks with mean degrees
$\langle k\rangle = \frac{1}{2}$, 1, 2 and 4 times $k_0 = 5.981$, and
$N = N_0 = 10697$ nodes ($k_0$ and $N_0$ correspond to the values for the
Internet at the AS level in 2001 (Park and Newman, 2003), which had
$r = r_0 = -0.189$). Symbols are the values obtained in (Park and Newman, 2003)
as those expected solely due to the one-edge-per-pair restriction (with $k_0$,
$N_0$ and $\gamma = 2.1$, 2.3 and 2.5). Inset: $r^*$ against $N$ for networks
with fixed $\langle k\rangle/N$ (same values as the main panel) and
$\gamma = 2.5$; the arrow indicates $N = N_0$.

Figure 5.4: Level of assortativity that maximizes the entropy, $r^*$, for
various real-world, scale-free networks, as predicted theoretically by Eq. (5.1)
(circles) and as directly measured (horizontal lines), against exponent
$\gamma$.



5.4 To sum up...

We have shown how the ensemble of networks with a given degree sequence
can be partitioned into regions of equally correlated networks and found, using
an information-theory approach, that the largest (maximum entropy) region,
for the case of scale-free networks, usually displays a certain disassortativity.
Therefore, in the absence of knowledge regarding the specific evolutionary
forces at work, this should be considered the most likely state. Given the
accuracy with which our approach can predict the degree of assortativity of
certain empirical networks with no a priori information thereon, we suggest
this as a neutral model to decide whether or not particular experimental data
require specific mechanisms to account for observed degree-degree correlations.



Chapter 6 



Enhancing robustness to noise 
via assortativity 

As we saw in an earlier chapter, the performance of attractor neural networks depends
crucially on the heterogeneity of the underlying topology's degree distribution.
We take this analysis a step further by examining the effect of degree-degree
correlations - or assortativity - on neural-network behaviour. In Chapter 5 we
described a method for studying correlated networks and dynamics thereon,
both analytically and computationally, which is independent of how the topology
may have evolved. We now make use of this to show how the robustness to
noise is greatly enhanced in assortative (positively correlated) neural networks,
especially if it is the hub neurons that store the information.



6.1 Background



For a dozen years or so now, the study of complex systems has been heavily
influenced by results from network science - which one might regard as the
fusion of graph theory with statistical physics (Newman, 2003c; Boccaletti et al.,
2006). Phenomena as diverse as epidemics (Watts and Strogatz, 1998), cellular
function (Siegal et al., 2006), power-grid failures (Buldyrev et al., 2010)
or internet routing (Boguna et al., 2010), among many others (Arenas et al.,
2008a), depend crucially on the structure of the underlying network of
interactions. One of the earliest systems to have been described as a network
was the brain, which is made up of a great many neurons connected
to each other by synapses (y Cajal, 1995; Amit, 1989; Abbott and Kepler, 1990;
Torres and Varona, 2010). Mathematically, the first neural networks
combined the Ising model (Baxter, 1982) with the Hebb learning rule (Hebb,
1949) to reproduce, very successfully, the storage and retrieval of information
(Amari, 1972; Hopfield, 1982; Amit, 1995). Neurons were simplified
to binary variables (like Ising spins) representing firing or non-firing cells.
By considering the trivial fully-connected topology, exact solutions could be
reached, which at the time seemed more important than attempting to
introduce biological realism. Subsequent work has tended to focus on
considering richer dynamics for the cells rather than on the way in which these are
interconnected (Vogels et al., 2005; Torres et al., 2007; Mejias et al., 2010).



However, the topology of the brain - whether at the level of neurons and
synapses, cortical areas or functional connections - is obviously far from
trivial (Amaral et al., 2000; Sporns et al., 2004; Eguiluz et al., 2005;
Arenas et al., 2008a; Bullmore and Sporns, 2009; Johnson et al., 2010a).



The number of neighbours a given node in a network has is called its degree,
and much attention is paid to degree distributions since they tend to be highly
heterogeneous for most real networks. In fact, they are often approximately
scale-free, i.e., described by power laws (Newman, 2003c; Boccaletti et al.,
2006; Peretto, 1992; Barabasi and Oltvai, 2004). By including this topological
feature in a Hopfield-like neural-network model, Torres et al. (2004) found
that degree heterogeneity increases the system's performance at
high levels of noise, since the hubs (high degree nodes) are able to retain
information at levels well above the usual critical noise. To prove this
analytically, the authors considered the configurational ensemble of networks (the
set of random networks with a given degree distribution but no degree-degree
correlations) and showed that Monte Carlo (MC) simulations were in good
agreement with mean-field analysis, despite the approximation inherent to the
latter technique when the network is not fully connected. A similar approach
can also be used to show how heterogeneity may be advantageous for the
performance of certain tasks in models with a richer dynamics (Johnson et al.,
2008). It is worth mentioning that this influence of the degree distribution on
dynamical behaviour is found in many other settings, such as the more general
situation of systems of coupled oscillators (Barahona and Pecora, 2002).

Another property of empirical networks that is quite ubiquitous is the
existence of correlations between the degrees of nodes and those of their neighbours
(Pastor-Satorras et al., 2001; Newman, 2002, 2003a). If the average degree-degree
correlation is positive the network is said to be assortative, while it is
called disassortative if negatively correlated. Most heterogeneous networks are
disassortative (Newman, 2003c), which, as described in Chapter 5, seems to be
because this is in some sense their equilibrium (maximum entropy) state given
the constraints imposed by the degree distribution (Johnson et al., 2010b).
However, there are probably often mechanisms at work which drive systems
from equilibrium by inducing different correlations, as appears to be the case
for most social networks, in which nodes (people) of a kind tend to group
together. This feature, known as assortativity or mixing by degree, is also
relevant for processes taking place on networks. For instance, assortative
networks have lower percolation thresholds and are more robust to targeted attack
(Newman, 2003a), while disassortative ones make for more stable ecosystems
and are - at least according to the usual definition - more synchronizable
(Brede and Sinha).



The approach usually taken when studying correlated networks computationally
is to generate a network from the configuration ensemble and then
introduce correlations (positive or negative) by some stochastic rewiring
process (Maslov et al., 2004). A drawback of this method, however, is that results
may well then depend on the details of this mechanism: there is no guarantee
that one is correctly sampling the phase space of networks with given
correlations. For analytical work, some kind of hidden variables from which the
correlations originate are often considered (Caldarelli et al., 2002;
Soderberg, 2002; Boguna and Pastor-Satorras, 2003; Fronczak and Fronczak, 2006) - an
assumption which can also be used to generate correlated networks
computationally (Boguna and Pastor-Satorras, 2003). This can be a very powerful
method for solving specific network models. However, it may not be appropriate
if one wishes to consider all possible networks with given degree-degree
correlations, independently of how these may have arisen. In this chapter, we get
round the problem by making use of the method put forward by Johnson et al.
(2010b) (and described in Chapter 5) whereby the ensemble of all networks
with given correlations can be considered theoretically without resorting to
hidden variables (de Franciscis et al., 2011). Furthermore, we show how this
approach can be used computationally to generate random networks that are
representative of the ensemble of interest (i.e., they are model-independent). In
this way, we study the effect of correlations on a simple neural network model
and find that assortativity increases performance in the face of noise -
particularly if it is the hubs that are mainly responsible for storing information (and
it is worth mentioning that there is experimental evidence suggestive of a main
functional role played by hub neurons in the brain (Morgan and Soltesz, 2008;
Bonifazi et al., 2009)). The good agreement between the mean-field analysis
and our MC simulations bears witness both to the robustness of the results as
regards neural systems, and to the viability of using this method for studying
dynamics on correlated networks.



6.2 Preliminary considerations 



6.2.1 Model neurons on networks 



The attractor neural network model put forward by Hopfield (Hopfield, 1982)
consists of binary neurons, each with an activity given by the dynamic
variable $s_i = \pm 1$. Every time step (MCS), each neuron is updated according to
the stochastic transition probability $P(s_i \to \pm 1) = \frac{1}{2}[1 \pm \tanh(h_i/T)]$
(parallel dynamics), where the field $h_i$ is the combined effect on $i$ of all its neighbours,
$h_i = \sum_j w_{ij} s_j$, and $T$ is a noise parameter we shall call temperature, but which
represents any kind of random fluctuations in the environment. This is the
same as the Ising model for magnetic systems, and the transition rule can be
derived from a simple interaction energy such that aligned variables $s$ (spins)
contribute less energy than if they were to take opposite values. However,
this system can store $P$ given configurations (memory patterns) $\xi_i^{\nu} = \pm 1$ by
having the interaction strengths (synaptic weights) set according to the Hebb
rule (Hebb, 1949): $w_{ij} \propto \sum_{\nu=1}^{P} \xi_i^{\nu}\xi_j^{\nu}$. In this way, each pattern becomes an
attractor of the dynamics, and the system will evolve towards whichever one is
closest to the initial state it is placed in. This mechanism is called associative
memory, and is nowadays used routinely for tasks such as image identification.
What is more, it has been established that something similar to the
Hebb rule is implemented in nature via the processes of long-term potentiation
and depression at the synapses (Malenka and Nicoll, 1999; Rodriguez-Moreno
and Paulsen, 2008; Kwag and Paulsen, 2009), and
this phenomenon is indeed required for learning (Gruart et al., 2006;
Roo et al., 200?).



To take into account the topology of the network, we shall consider the
weights to be of the form $w_{ij} = \omega_{ij} a_{ij}$, where the element $a_{ij}$ of the adjacency
matrix $\hat{a}$ represents the number of directed edges (usually interpreted as synapses
in a neural network) from node $j$ to node $i$, while $\hat{\omega}$ stores the patterns, as
before:

$$\omega_{ij} = \frac{1}{\langle k\rangle}\sum_{\nu=1}^{P}\xi_i^{\nu}\xi_j^{\nu}$$

For the sake of coherence with previous work, we shall assume $\hat{a}$ to be
symmetric (i.e., the network is undirected), so each node is characterized by a
single degree $k_i = \sum_j a_{ij}$. However, all results are easily extended to directed
networks - in which nodes have both an in degree, $k_i^{in} = \sum_j a_{ij}$, and an out
degree, $k_i^{out} = \sum_j a_{ji}$ - by bearing in mind it is only a neuron's pre-synaptic
neighbours that influence its behaviour. The mean degree of the network is
$\langle k\rangle$, where the angles stand for an average over nodes: $\langle\cdot\rangle \equiv N^{-1}\sum_i(\cdot)$.
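The model of this subsection can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names, the ring topology and the sizes used below are assumptions made for the example and are not taken from the thesis. The weights follow $w_{ij} = \omega_{ij} a_{ij}$ with $\omega_{ij} = \langle k\rangle^{-1}\sum_\nu \xi_i^\nu \xi_j^\nu$, and every neuron is updated in parallel with $P(s_i \to +1) = \frac{1}{2}[1 + \tanh(h_i/T)]$.

```python
import math
import random


def hebb_weights(patterns, adjacency):
    """Return w_ij = a_ij * (1/<k>) * sum_nu xi_i^nu * xi_j^nu."""
    n = len(adjacency)
    mean_k = sum(sum(row) for row in adjacency) / n
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adjacency[i][j]:
                omega = sum(xi[i] * xi[j] for xi in patterns) / mean_k
                weights[i][j] = omega * adjacency[i][j]
    return weights


def parallel_step(state, weights, temperature, rng):
    """One Monte Carlo step: all neurons are updated from the same old state."""
    new_state = []
    for row in weights:
        field = sum(w * s for w, s in zip(row, state))
        p_up = 0.5 * (1.0 + math.tanh(field / temperature))
        new_state.append(1 if rng.random() < p_up else -1)
    return new_state


def overlap(state, pattern):
    """Standard overlap m = <xi_i s_i>."""
    return sum(x * s for x, s in zip(pattern, state)) / len(state)


# Toy example (hypothetical sizes): a ring of 20 neurons storing one pattern.
rng = random.Random(0)
n = 20
ring = [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
pattern = [rng.choice((-1, 1)) for _ in range(n)]
weights = hebb_weights([pattern], ring)

state = list(pattern)
for _ in range(10):
    state = parallel_step(state, weights, temperature=0.01, rng=rng)
print(overlap(state, pattern))
```

At such a low temperature the stored pattern is a fixed point of the dynamics, so the overlap stays at its maximum; raising `temperature` lets one watch the overlap decay.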



6.2.2 Network ensembles 

When one wishes to consider a set of networks which are randomly wired while
respecting certain constraints - that is, an ensemble - it is usually useful to
define the expected value of the adjacency matrix, $E(\hat{a}) = \hat{\epsilon}$. The element
$\epsilon_{ij}$ of this matrix is the mean value of $a_{ij}$ obtained by averaging over the
ensemble. For instance, in the Erdos-Renyi (ER) ensemble all elements (outside
the diagonal) take the value $\epsilon_{ij}^{ER} = \langle k\rangle/N$, which is the probability that a
given pair of nodes be connected by an edge. For studying networks with a
given degree sequence, $(k_1, ..., k_N)$, it is common to assume the configuration
ensemble, defined as

$$\epsilon_{ij}^{conf} = \frac{k_i k_j}{\langle k\rangle N}$$

This expression can usually be applied also when the constraint is a given
degree distribution, $p(k)$, by integrating over $p(k_i)$ and $p(k_j)$ where appropriate.
One way of deriving $\hat{\epsilon}^{conf}$ is to assume one has $k_i$ dangling half-edges at each
node $i$; we then randomly choose pairs of half-edges and join them together
until the network is wired up. Each time we do this, the probability that we
join $i$ to $j$ is $k_i k_j/(\langle k\rangle N)^2$, and we must perform the operation $\langle k\rangle N$ times.
Bianconi showed that this is also the solution for Barabasi-Albert evolved
networks (Bianconi, 2002). However, we should bear in mind that this result is
only strictly valid for networks constructed in certain particular ways, such
as in these examples. It is often implicitly assumed that were we to average
over all random networks with a given degree distribution, the mean adjacency
matrix obtained would be $\hat{\epsilon}^{conf}$. However, as we discussed in Chapter 5, this
is not in fact necessarily true (Johnson et al., 2010b).
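As a small numerical illustration of the configuration ensemble, the sketch below builds $\epsilon_{ij}^{conf} = k_i k_j/(\langle k\rangle N)$ for a made-up degree sequence and checks two properties that follow from the definition: each row sums to $k_i$, and the expected mean degree of a node's neighbours is the same for every node. The function name and the degree sequence are assumptions chosen for the example.

```python
def configuration_matrix(degrees):
    """Expected adjacency matrix eps_ij = k_i k_j / (<k> N)."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    return [[ki * kj / (mean_k * n) for kj in degrees] for ki in degrees]


degrees = [1, 2, 2, 3, 4, 6]  # hypothetical degree sequence
eps = configuration_matrix(degrees)

# Consistency: the expected degree of node i is k_i.
expected_degrees = [sum(row) for row in eps]

# Expected mean degree of the neighbours of i, which does not depend on k_i.
knn = [sum(e * kj for e, kj in zip(row, degrees)) / ki
       for row, ki in zip(eps, degrees)]

mean_k = sum(degrees) / len(degrees)
mean_k2 = sum(k * k for k in degrees) / len(degrees)
print(expected_degrees)
print(knn, mean_k2 / mean_k)
```

The flat `knn` list is what is meant by the configuration ensemble having no degree-degree correlations.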



^In directed networks the mean in degree and the mean out degree necessarily coincide, 
whatever the forms of the in and out distributions. 

^As in statistical physics, one can consider the microcanonical ensemble, in which each
element (network) satisfies the constraints exactly, or the canonical ensemble, where the
constraints are satisfied on average (Bianconi, 2009). Throughout this work, we shall refer
to canonical ensembles.

Figure 6.1: Mean-nearest-neighbour functions $k_{nn}(k)$ for scale-free networks
with $\beta = -0.5$ (disassortative), 0.0 (neutral), and 0.5 (assortative), generated
according to the algorithm described in Sec. 6.3.2. Inset: degree distribution
(the same in all three cases). Other parameters are $\gamma = 2.5$, $\langle k\rangle = 12.5$,
$N = 10^4$.

6.2.3 Correlated networks 

In the configuration ensemble, the expected value of the mean degree of the
neighbours of a given node is $k_{nn,i} = k_i^{-1}\sum_j \epsilon_{ij}^{conf} k_j = \langle k^2\rangle/\langle k\rangle$, which is
independent of $k_i$. However, as mentioned above, real networks often display
degree-degree correlations, with the result that $k_{nn,i} = k_{nn}(k_i)$. If $k_{nn}(k)$
increases with $k$, the network is said to be assortative - whereas it is
disassortative if it decreases with $k$ (see Fig. 6.1). This is from the more general
nomenclature (borrowed from sociology) in which sets are assortative if elements of
a kind group together, or assort. In the case of degree-degree correlated
networks, positive assortativity means that edges are more than randomly likely
to occur between nodes of a similar degree.

The ensemble of all networks with a given degree sequence $(k_1, ..., k_N)$
contains a subset for all members of which $k_{nn}(k)$ is constant (the configuration
ensemble), but also subsets displaying other functions $k_{nn}(k)$. We can
identify each one of these subsets (regions of phase space) with an expected
adjacency matrix $\hat{\epsilon}$ which simultaneously satisfies the following conditions: i)
$\sum_j k_j \epsilon_{ij} = k_i k_{nn}(k_i)$, $\forall i$ (by definition of $k_{nn}(k)$), and ii) $\sum_j \epsilon_{ij} = k_i$, $\forall i$ (for
consistency). As we showed in Chapter 5, the general solution to this problem
is a matrix of the form

$$\epsilon_{ij} = \frac{k_i k_j}{\langle k\rangle N} + \int d\nu\,\frac{f(\nu)}{N}\left[\frac{(k_i k_j)^{\nu}}{\langle k^{\nu}\rangle} - k_i^{\nu} - k_j^{\nu} + \langle k^{\nu}\rangle\right] \qquad (6.1)$$

where $\nu \in \mathbb{R}$ and the function $f(\nu)$ is determined by $k_{nn}(k)$ (Johnson et al.,
2010b). (If the network were directed, then $k_i = k_i^{in}$ and $k_j = k_j^{out}$ in this
expression.) This yields

$$k_{nn}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + \int d\nu\, f(\nu)\,\sigma_{\nu+1}\left[\frac{k^{\nu-1}}{\langle k^{\nu}\rangle} - \frac{1}{k}\right] \qquad (6.2)$$

(the first term being the result for the configuration ensemble), where $\sigma_{b+1} \equiv
\langle k^{b+1}\rangle - \langle k\rangle\langle k^{b}\rangle$. This means that $\hat{\epsilon}$ is not just one possible way of obtaining
correlations according to $k_{nn}(k)$; rather, there is a two-way mapping between
$\hat{\epsilon}$ and $k_{nn}(k)$: every network with this particular function $k_{nn}(k)$, and no other
ones, are contained in the ensemble defined by $\hat{\epsilon}$. Thanks to this, if we are
able to consider random networks drawn according to this matrix (whether
we do this analytically or computationally; see Section 6.3.2), we can be
confident that we are correctly taking account of the whole ensemble of interest.
In other words, whatever the reasons behind the existence of degree-degree
correlations in a given network, we can study the effects of these with only
information on $p(k)$ and $k_{nn}(k)$ by obtaining the associated matrix $\hat{\epsilon}$. This is
not to say, of course, that all topological properties are captured in this way: a
particular network may have other features - such as higher order correlations,
modularity, etc. - the consideration of which would require concentrating on
a sub-partition of those with the same $p(k)$ and $k_{nn}(k)$. But this is not our
purpose here.

In many empirical networks, $k_{nn}(k)$ has the form $k_{nn}(k) = A + Bk^{\beta}$, with
$A, B > 0$ (Boccaletti et al., 2006; Pastor-Satorras et al., 2001) - the mixing
being assortative if $\beta$ is positive, and disassortative when negative. Such a
case is fitted by Eq. (6.2) if

$$f(\nu) = C\left[\frac{\sigma_2}{\sigma_{\beta+2}}\,\delta(\nu - \beta - 1) - \delta(\nu - 1)\right] \qquad (6.3)$$

with $C$ a positive constant, since this choice yields

$$k_{nn}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + C\sigma_2\left[\frac{k^{\beta}}{\langle k^{\beta+1}\rangle} - \frac{1}{\langle k\rangle}\right] \qquad (6.4)$$

In Chapter 5 we discussed how the most likely configurations for networks
with scale-free degree distributions ($p(k) \sim k^{-\gamma}$) and correlations given by
Eq. (6.4) are generally disassortative. We also showed that the maximum
entropy is usually obtained for values of $C$ close to one. Here, we shall use
this result to justify concentrating on correlated networks with $C = 1$, so that
the only parameter we need to take into account is $\beta$. It is worth mentioning
that Pastor-Satorras et al. originally suggested using this exponent as a way
of quantifying correlations (Pastor-Satorras et al., 2001), since this seems to
be the most relevant magnitude. Because $\beta$ does not depend directly on $p(k)$
(as $r$ does), and can be defined for networks of any size (whereas $r$, in very
heterogeneous networks, always goes to zero for large $N$ due to its
normalization (Dorogovtsev et al., 2005)), we shall henceforth use $\beta$ as our assortativity
parameter.

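So, after plugging Eq. (6.3) into Eq. (6.1), we find that the ensemble of
networks exhibiting correlations given by Eq. (6.4) (and $C = 1$) is defined by
the mean adjacency matrix

$$\epsilon_{ij} = \frac{1}{N}\left\{k_i + k_j - \langle k\rangle + \frac{\sigma_2}{\sigma_{\beta+2}}\left[\frac{(k_i k_j)^{\beta+1}}{\langle k^{\beta+1}\rangle} - k_i^{\beta+1} - k_j^{\beta+1} + \langle k^{\beta+1}\rangle\right]\right\} \qquad (6.5)$$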
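The two defining conditions of Section 6.2.3 can be checked numerically for the matrix of Eq. (6.5). The following is a minimal sketch, assuming a small made-up degree sequence; the function names are illustrative only. It verifies that every row sums to $k_i$ and that the resulting $k_{nn}(k)$ has the form of Eq. (6.4) with $C = 1$, namely $\langle k\rangle + \sigma_2 k^{\beta}/\langle k^{\beta+1}\rangle$.

```python
def moment(degrees, a):
    """Moment <k^a> of the degree sequence."""
    return sum(k ** a for k in degrees) / len(degrees)


def correlated_matrix(degrees, beta):
    """Expected adjacency matrix of Eq. (6.5), i.e. with C = 1."""
    n = len(degrees)
    mean_k = moment(degrees, 1)
    m_b1 = moment(degrees, beta + 1)
    sigma_2 = moment(degrees, 2) - mean_k ** 2
    sigma_b2 = moment(degrees, beta + 2) - mean_k * m_b1
    ratio = sigma_2 / sigma_b2
    eps = []
    for ki in degrees:
        row = []
        for kj in degrees:
            bracket = ((ki * kj) ** (beta + 1) / m_b1
                       - ki ** (beta + 1) - kj ** (beta + 1) + m_b1)
            row.append((ki + kj - mean_k + ratio * bracket) / n)
        eps.append(row)
    return eps


def knn_from_matrix(eps, degrees):
    """Expected mean neighbour degree of each node, sum_j k_j eps_ij / k_i."""
    return [sum(e * kj for e, kj in zip(row, degrees)) / ki
            for row, ki in zip(eps, degrees)]


degrees = [1, 2, 2, 3, 4, 6]  # hypothetical degree sequence
for beta in (-0.5, 0.0, 0.5):
    eps = correlated_matrix(degrees, beta)
    print(beta, [round(v, 3) for v in knn_from_matrix(eps, degrees)])
```

For $\beta = 0$ the printed values are all equal (the configuration ensemble), while they increase or decrease with degree for positive or negative $\beta$ respectively.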



6.3 Analysis and results 



6.3.1 Mean field 



Let us consider the single-pattern case ($P = 1$, $\xi_i^{\nu} = \xi_i$). Substituting the
adjacency matrix $\hat{a}$ for its expected value $\hat{\epsilon}$ (as given by Eq. (6.5)) in the
expression for the local field at $i$ - which amounts to a mean-field approximation
- we have

$$h_i = \frac{\xi_i}{\langle k\rangle}\left\{\left[(k_i - \langle k\rangle) + \frac{\sigma_2}{\sigma_{\beta+2}}\left(\langle k^{\beta+1}\rangle - k_i^{\beta+1}\right)\right]\mu_0 + \langle k\rangle\mu_1 + \frac{\sigma_2}{\sigma_{\beta+2}}\left(k_i^{\beta+1} - \langle k^{\beta+1}\rangle\right)\mu_{\beta+1}\right\}$$

where we have defined

$$\mu_{\alpha} \equiv \frac{\langle k_i^{\alpha}\xi_i s_i\rangle}{\langle k^{\alpha}\rangle}$$

for $\alpha = 0, 1, \beta + 1$. These order parameters measure the extent to which the
system is able to recall information in spite of noise (Johnson et al., 2008).
For the first order we have $\mu_0 = m \equiv \langle\xi_i s_i\rangle$, the standard overlap measure
in neural networks (analogous to magnetization in magnetic systems), which
takes account of memory performance. However, $\mu_1$, for instance, weighs the
sum with the degree of each node, with the result that it measures information
per synapse instead of per neuron. Although the overlap $m$ is often assumed
to represent, in some sense, the mean firing rate of neurological experiments,
it is possible that $\mu_1$ is more closely related to the empirical measure, since the
total electric potential in an area of tissue is likely to depend on the number of
synapses transmitting action potentials. In any case, a comparison between the
two order parameters is a good way of assessing to what extent the performance
of neurons depends on their degree - larger-degree model neurons can in general
store information at higher temperatures than ones with smaller degree can
(Torres et al., 2004).

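Substituting $s_i$ for its expected value according to the transition probability,
$s_i \to \tanh(h_i/T)$, we have, for any $\alpha$,

$$\langle k_i^{\alpha}\xi_i s_i\rangle = \langle k_i^{\alpha}\xi_i\tanh(h_i/T)\rangle;$$

or, equivalently, the following 3-D map of closed coupled equations for the
macroscopic overlap observables $\mu_0$, $\mu_1$ and $\mu_{\beta+1}$ - which describes, in this
mean-field approximation, the dynamics of the system:

$$\mu_0(t+1) = \int p(k)\tanh[F(t)/(\langle k\rangle T)]\,dk$$

$$\mu_1(t+1) = \frac{1}{\langle k\rangle}\int p(k)\,k\tanh[F(t)/(\langle k\rangle T)]\,dk \qquad (6.6)$$

$$\mu_{\beta+1}(t+1) = \frac{1}{\langle k^{\beta+1}\rangle}\int p(k)\,k^{\beta+1}\tanh[F(t)/(\langle k\rangle T)]\,dk,$$

with

$$F(t) \equiv \left(k\mu_0(t) + \langle k\rangle\mu_1(t) - \langle k\rangle\mu_0(t)\right) + \frac{\sigma_2}{\sigma_{\beta+2}}\left(k^{\beta+1} - \langle k^{\beta+1}\rangle\right)\left(\mu_{\beta+1}(t) - \mu_0(t)\right).$$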

This can be easily computed for any degree distribution $p(k)$. Note that taking
$\beta = 0$ (the uncorrelated case) the system collapses to the 2-D map obtained by
Torres et al. (2004), while it becomes the typical 1-D case for a homogeneous
$p(k)$ - say a fully-connected network (Hopfield, 1982). It is in principle
possible to do similar mean-field analysis for any number $P$ of patterns, but the
map would then be 3P-dimensional, making the problem substantially more
complex.
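To illustrate how the map of Eqs. (6.6) can be iterated in practice, here is a minimal sketch in which the integrals over $p(k)$ are replaced by averages over an explicit degree sequence (an assumption made for simplicity; the sequence, function names and temperatures are hypothetical). Starting from full overlap, the map is iterated until it settles on its stationary values.

```python
import math


def moment(degrees, a):
    """Moment <k^a> of the degree sequence."""
    return sum(k ** a for k in degrees) / len(degrees)


def mean_field_step(mu, degrees, beta, temperature):
    """One iteration of the 3-D map for (mu_0, mu_1, mu_{beta+1})."""
    mu0, mu1, mub = mu
    n = len(degrees)
    mean_k = moment(degrees, 1)
    m_b1 = moment(degrees, beta + 1)
    sigma_2 = moment(degrees, 2) - mean_k ** 2
    sigma_b2 = moment(degrees, beta + 2) - mean_k * m_b1
    ratio = sigma_2 / sigma_b2
    new0 = new1 = newb = 0.0
    for k in degrees:
        field = (k * mu0 + mean_k * (mu1 - mu0)
                 + ratio * (k ** (beta + 1) - m_b1) * (mub - mu0))
        t = math.tanh(field / (mean_k * temperature))
        new0 += t
        new1 += k * t
        newb += k ** (beta + 1) * t
    return new0 / n, new1 / (n * mean_k), newb / (n * m_b1)


def stationary_overlaps(degrees, beta, temperature, steps=500):
    """Iterate the map from the fully ordered state (1, 1, 1)."""
    mu = (1.0, 1.0, 1.0)
    for _ in range(steps):
        mu = mean_field_step(mu, degrees, beta, temperature)
    return mu


# Hypothetical heterogeneous degree sequence: many small nodes, a few hubs.
degrees = [2] * 60 + [5] * 30 + [20] * 10
for beta in (-0.5, 0.0, 0.5):
    print(beta, stationary_overlaps(degrees, beta, temperature=2.0))
```

Comparing the second component ($\mu_1$) across the three values of $\beta$ at a fixed temperature gives a quick feel for the effect of correlations, although a sequence this small is only a toy stand-in for a scale-free $p(k)$.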

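At a critical temperature $T_c$, the system will undergo the characteristic
second order phase transition from a phase in which it exhibits memory (akin
to ferromagnetism) to one in which it does not (paramagnetism). To obtain
this critical temperature, we can expand the hyperbolic tangent in Eqs. (6.6)
around the trivial solution $(\mu_0, \mu_1, \mu_{\beta+1}) = (0, 0, 0)$ and, keeping only linear
terms, write

$$\mu_0 = \frac{\mu_1}{T_c}$$

$$\mu_1 = \frac{1}{\langle k\rangle^2 T_c}\left[\langle k\rangle^2\mu_1 + \sigma_2\,\mu_{\beta+1}\right]$$

$$\mu_{\beta+1} = \frac{1}{\langle k\rangle\langle k^{\beta+1}\rangle T_c}\left[\sigma_{\beta+2}\,\mu_0 + \langle k\rangle\langle k^{\beta+1}\rangle\mu_1 + \frac{\sigma_2}{\sigma_{\beta+2}}\left(\langle k^{2(\beta+1)}\rangle - \langle k^{\beta+1}\rangle^2\right)\left(\mu_{\beta+1} - \mu_0\right)\right]$$

Defining

$$A \equiv \frac{\sigma_2}{\langle k\rangle^2}, \qquad B \equiv \frac{\sigma_2}{\sigma_{\beta+2}}\,\frac{\langle k^{2(\beta+1)}\rangle - \langle k^{\beta+1}\rangle^2}{\langle k\rangle\langle k^{\beta+1}\rangle}, \qquad D \equiv \frac{\sigma_{\beta+2}}{\langle k\rangle\langle k^{\beta+1}\rangle},$$

$T_c$ will be the solution to the third order polynomial equation:

$$T_c^3 - (B + 1)T_c^2 + (B - A)T_c + A(B - D) = 0. \qquad (6.7)$$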
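Equation (6.7) is straightforward to solve numerically. The sketch below is one possible way of doing so, under stated assumptions: the moments are taken over an explicit (hypothetical) degree sequence rather than a distribution $p(k)$, the critical temperature is identified with the largest real root, and that root is located by a simple downward scan followed by bisection. All function names are illustrative.

```python
def moment(degrees, a):
    """Moment <k^a> of the degree sequence."""
    return sum(k ** a for k in degrees) / len(degrees)


def cubic_coefficients(degrees, beta):
    """Coefficients A, B and D entering Eq. (6.7)."""
    mean_k = moment(degrees, 1)
    m_b1 = moment(degrees, beta + 1)
    sigma_2 = moment(degrees, 2) - mean_k ** 2
    sigma_b2 = moment(degrees, beta + 2) - mean_k * m_b1
    var_b1 = moment(degrees, 2 * beta + 2) - m_b1 ** 2
    a = sigma_2 / mean_k ** 2
    b = (sigma_2 / sigma_b2) * var_b1 / (mean_k * m_b1)
    d = sigma_b2 / (mean_k * m_b1)
    return a, b, d


def critical_temperature(degrees, beta, grid=20000):
    """Largest real root of T^3 - (B+1) T^2 + (B-A) T + A (B-D) = 0."""
    a, b, d = cubic_coefficients(degrees, beta)

    def poly(t):
        return t ** 3 - (b + 1) * t ** 2 + (b - a) * t + a * (b - d)

    # Cauchy bound: no root of this monic cubic lies above `upper`.
    upper = 1.0 + max(abs(b + 1), abs(b - a), abs(a * (b - d)))
    hi = upper
    for step in range(1, grid + 1):
        lo = upper * (1.0 - step / grid)
        if poly(lo) <= 0.0:  # sign change between lo and hi
            for _ in range(100):
                mid = 0.5 * (lo + hi)
                if poly(mid) > 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        hi = lo
    return 0.0  # no positive root found on the grid


degrees = [2] * 60 + [5] * 30 + [20] * 10  # hypothetical degree sequence
for beta in (-0.5, 0.0, 0.5):
    print(beta, critical_temperature(degrees, beta))
```

For this toy sequence the value obtained at $\beta = 0$ can be compared directly with $\langle k^2\rangle/\langle k\rangle^2$.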



Note that for neutral (i.e., uncorrelated) networks, $\beta = 0$, and so $A = B = D$.
We then have $T_c = \langle k^2\rangle/\langle k\rangle^2$, as expected (Johnson et al., 2008).

6.3.2 Generating correlated networks



Given a degree distribution $p(k)$, the ensemble of networks compatible with
this constraint and with degree-degree correlations according to Eq. (6.4)
(with some exponent $\beta$) is defined by the mean adjacency matrix $\hat{\epsilon}$ of Eq.
(6.5) - as described in Section 6.2.3 and by Johnson et al. (2010b). Therefore,
although there will generally be an enormous number of possible networks in
this volume of phase space, we can sample them correctly simply by generating
them according to $\hat{\epsilon}$. To do this, first we have to assign to each node a degree
drawn from $p(k)$. If the elements of $\hat{\epsilon}$ were probabilities, it would suffice then
to connect each pair of nodes $(i, j)$ with probability $\epsilon_{ij}$ to generate a valid
network. Strictly speaking, $\hat{\epsilon}$ is an expected value, which in certain cases can
be greater than one. To get round this, we write a probability matrix $\hat{p} = \hat{\epsilon}/a$,
with $a$ some value such that all elements of $\hat{p}$ are smaller than one. If we
then take random pairs of nodes $(i, j)$ and, with probability $p_{ij}$, place an edge
between them, repeating the operation until $\frac{1}{2}\langle k\rangle N$ edges have been placed,
the expected value of edges joining $i$ and $j$ will be $\epsilon_{ij}$. This method is like the
hidden variable technique (Boguna and Pastor-Satorras, 2003) in that edges
are placed with a predefined probability (which is why the resulting ensemble
is canonical). The difference lies in the fact that in the method here described
correlations only depend on the degrees of nodes.
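The sampling procedure just described can be sketched as follows. This is an illustrative sketch rather than the code used for the simulations: the function names, the degree cutoffs and the network size are assumptions, self-loops are excluded by choice, and any negative entry of the expected matrix is simply never accepted. For brevity the example passes the configuration matrix (the $\beta = 0$ case); the matrix of Eq. (6.5) for any other $\beta$ could be supplied in the same way.

```python
import random


def draw_degrees(n, gamma, k_min, k_max, rng):
    """Draw n degrees from p(k) ~ k^(-gamma) on k_min..k_max (hypothetical cutoffs)."""
    values = list(range(k_min, k_max + 1))
    weights = [k ** (-gamma) for k in values]
    return rng.choices(values, weights=weights, k=n)


def configuration_matrix(degrees):
    """Expected adjacency matrix for beta = 0: eps_ij = k_i k_j / (<k> N)."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    return [[ki * kj / (mean_k * n) for kj in degrees] for ki in degrees]


def sample_network(eps, n_edges, rng):
    """Place n_edges edges, accepting a random pair (i, j) with probability eps_ij / a."""
    n = len(eps)
    a = max(max(row) for row in eps)  # ensures eps_ij / a never exceeds one
    adjacency = [[0] * n for _ in range(n)]
    placed = 0
    while placed < n_edges:
        i, j = rng.randrange(n), rng.randrange(n)
        if i == j:
            continue  # assumption: self-loops are not allowed
        if rng.random() < eps[i][j] / a:
            adjacency[i][j] += 1  # multiple edges between a pair are allowed
            adjacency[j][i] += 1
            placed += 1
    return adjacency


rng = random.Random(1)
degrees = draw_degrees(200, gamma=2.5, k_min=3, k_max=30, rng=rng)
n_edges = sum(degrees) // 2
network = sample_network(configuration_matrix(degrees), n_edges, rng)
realized = [sum(row) for row in network]
print(sum(realized) / len(realized), sum(degrees) / len(degrees))
```

Because the ensemble is canonical, the realized degrees only match the assigned ones on average; the two printed mean degrees agree up to the rounding of the number of edges.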



We are interested here in neural networks, in which a given pair of nodes
can be joined by several synapses, so we shall not impose the restriction of
so-called simple networks of allowing only one edge at most per pair. We
shall, however, consider networks with a structural cutoff: $k_i < \sqrt{\langle k\rangle N}$, $\forall i$
(Bianconi, 2008). This ensures that, at least for $\beta < 0$, all elements of $\hat{\epsilon}$ are
indeed smaller than one.



Because we can expect effects due to degree-degree correlations to be largest
when $p(k)$ is very broad, and since most networks in nature and technology
seem to exhibit approximately power-law degree distributions (Newman,
2003c; Arenas et al., 2008a; Peretto, 1992; Barabasi and Oltvai, 2004), we shall
here test our general theoretical results against simulations of scale-free
networks: $p(k) \sim k^{-\gamma}$. This means that a network (or the region of phase space
to which it belongs) is characterized by the set of parameters $\{\langle k\rangle, N, \gamma, \beta\}$.

Figure 6.2: Stable stationary value of the weighted overlap $\mu_1$ against
temperature $T$ for scale-free networks with correlations according to $k_{nn} \sim k^{\beta}$, for
$\beta = -0.5$ (disassortative), 0.0 (neutral), and 0.5 (assortative). Symbols from
MC simulations, with errorbars representing standard deviations, and lines
from Eqs. (6.6). Other network parameters as in Fig. 6.1. Inset: $\mu_1$ against $T$
for the assortative case ($\beta = 0.5$) and different system sizes: $N = 10^4$, $3 \cdot 10^4$
and $5 \cdot 10^4$.

6.3.3 Assortativity and dynamics 

In Fig. 6.2 we plot the stationary value of $\mu_1$ against the temperature $T$,
as obtained from simulations and Eqs. (6.6), for disassortative, neutral and
assortative networks. The three curves are similar at low temperatures, but
as $T$ increases their behaviour becomes quite different. The disassortative
network is the least robust to noise. However, the assortative one is capable
of retaining some information at temperatures considerably higher than the
critical value, $T_c = \langle k^2\rangle/\langle k\rangle^2$, of neutral networks. A comparison between $\mu_1$
and $\mu_0$ (see Fig. 6.3) shows that it is the high degree nodes that are mainly
responsible for this difference in performance. This can be seen more clearly in
Fig. 6.4, which displays the difference $\mu_1 - \mu_0$ against $T$ for the same networks.
It seems that, because in an assortative network a sub-graph of hubs will
have more edges than in a disassortative one, it has a higher effective critical
temperature. Therefore, even when most of the nodes are acting randomly,
the set of nodes of sufficiently high degree nevertheless displays associative
memory.

Figure 6.3: Stable stationary values of order parameters $\mu_0$, $\mu_1$ and $\mu_{\beta+1}$
against temperature $T$, for assortative networks according to $\beta = 0.5$. Symbols
from MC simulations, with errorbars representing standard deviations, and
lines from Eqs. (6.6). Other parameters as in Fig. 6.1.

Figure 6.4: Difference between the stationary values $\mu_1$ and $\mu_0$ for networks
with $\beta = -0.5$ (disassortative), 0.0 (neutral) and 0.5 (assortative), against
temperature. Symbols from MC simulations, with errorbars representing
standard deviations, and lines from Eqs. (6.6). Line shows the expected level of
fluctuations due to noise, $\sim N^{-1/2}$. Other parameters as in Fig. 6.1.

The phase diagram of Fig. 6.5 shows the critical temperature, $T_c$, as
obtained from Eq. (6.7). In addition to the effect reported by Torres et al.
(2004) whereby the $T_c$ of scale-free networks grows with degree heterogeneity
(decreasing $\gamma$), it also increases very significantly with positive degree-degree
correlations (increasing $\beta$).

At large values of $N$, the critical temperature scales as $T_c \sim N^{b}$, with $b \geq 0$
a constant. However, because the moments of $k$ appearing in the coefficients of
Eq. (6.7) can have different asymptotic behaviour depending on the values of
$\gamma$ and $\beta$, the scaling exponent $b$ differs from one region to another in the space
of these parameters. These are the seven regions shown in Fig. 6.6, along with
the scaling behaviour exhibited by each one. This can be seen explicitly in
Fig. 6.7, where $T_c$, as obtained from MC simulations, is plotted against $N$ for
cases in each of the regions with $\gamma < 3$. In each case, the scaling is as given by
Eq. (6.7) and shown in Fig. 6.6. For the four regions with $\gamma < 3$, from lowest
to highest assortativity we have scaling exponents which are dependent on:
only $\gamma$ (region I), only $\beta$ (region II), both $\gamma$ and $\beta$ (region III), and, perhaps
most interestingly, neither of the two (region IV) - with $T_c$ scaling, in the latter
case, as $\sqrt{N}$. As for the more homogeneous $\gamma > 3$ part, regions V and VI have
a diverging critical temperature despite the fact that the second moment of
$p(k)$ is finite, simply as a result of assortativity.

The case in which more than one pattern is stored ($P > 1$) can be explored
numerically. Assuming there are $P$ uncorrelated patterns, we have an order
parameter $\mu_1^{\nu}$ for each pattern $\nu$. A global measure of the degree to which
there is memory can be captured by the parameter $\zeta$, where

$$\zeta^2 \equiv \frac{1}{1 + P/N}\sum_{\nu=1}^{P}\left(\mu_1^{\nu}\right)^2$$

Notice that the normalization factor is due to the fact that if one pattern is
condensed - i.e., $\mu_1^{1} \sim 1$ - the others have $\mu_1^{\nu} \sim 1/\sqrt{N}$, $\nu = 2, ..., P$, and so $\zeta \simeq
1$. Figure 6.8 shows how $\zeta$ decreases with $T$ in variously correlated networks
for $P = 3$ (left panel) and $P = 10$ patterns (right panel). The behaviour is
not qualitatively different from that observed for the single-pattern case in the
main panel of Fig. 6.2, suggesting that the influence of assortativity we report
is robust as to the number of patterns stored, $P$.
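The role of the normalization factor can be checked with a few lines of code. The sketch below assumes the definition $\zeta^2 = (1 + P/N)^{-1}\sum_\nu (\mu_1^\nu)^2$ with degree-weighted overlaps; the function names and the numbers in the example are hypothetical.

```python
import math


def weighted_overlap(state, pattern, degrees):
    """mu_1 = <k_i xi_i s_i> / <k> for a single pattern."""
    return sum(k * x * s for k, x, s in zip(degrees, pattern, state)) / sum(degrees)


def global_overlap(overlaps, n_nodes):
    """zeta = sqrt( sum_nu (mu^nu)^2 / (1 + P/N) ) for P = len(overlaps) patterns."""
    p = len(overlaps)
    return math.sqrt(sum(m * m for m in overlaps) / (1.0 + p / n_nodes))


# One condensed pattern and nine others at the noise level 1/sqrt(N).
n_nodes = 10000
overlaps = [1.0] + [1.0 / math.sqrt(n_nodes)] * 9
print(global_overlap(overlaps, n_nodes))

# With no pattern condensed, all overlaps are at the noise level.
print(global_overlap([1.0 / math.sqrt(n_nodes)] * 10, n_nodes))
```

The first value is very close to one and the second is small, which is the behaviour the normalization is designed to give.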







Figure 6.5: Phase diagrams for scale-free networks with $\gamma = 2.5$, 3, and 3.5.
Lines show the critical temperature $T_c$ marking the second-order transition
from a memory (ferromagnetic) phase to a memoryless (paramagnetic) one,
against the assortativity $\beta$, as given by Eq. (6.7). Other parameters as in Fig. 6.1.

Figure 6.6: Parameter space $\beta - \gamma$ partitioned into the regions in which $b(\beta, \gamma)$
has the same functional form - where $b$ is the scaling exponent of the critical
temperature: $T_c \sim N^{b}$. Exponents obtained by taking the large $N$ limit in Eq.
(6.7).

Figure 6.7: Examples of how $T_c$ scales with $N$ for networks belonging to
regions I, II, III and IV of Fig. 6.6 ($\beta = -0.8$, $-0.35$, 0.0 and 0.9,
respectively). Symbols from MC simulations, with errorbars representing standard
deviations, and slopes from Eq. (6.7). All parameters - except for $\beta$ and $N$ -
are as in Fig. 6.1.



6.4 Discussion 



We have shown that assortative networks of simple model neurons are able to
exhibit associative memory in the presence of levels of noise such that
uncorrelated (or disassortative) networks cannot. This may appear to be in
contradiction with a recent result obtained using spectral graph analysis - that
synchronizability of a set of coupled oscillators is highest for disassortative
networks (Brede and Sinha). A synchronous state of model oscillators and a
memory phase of model neurons both involve sets of many simple dynamical
elements coupled via a network in such a way that a macroscopically coherent
situation is maintained (Barahona and Pecora, 2002). Obviously both systems
require the effective transmission of information among the elements. So why
are opposite results as regards the influence of topology reported for each
system? The answer is simple: whereas the definition of a synchronous state is
that every single element oscillate at the same frequency, it is precisely when
most elements are actually behaving randomly that the advantages to
assortativity we report become apparent. In fact, it can be seen in Fig. 6.2 that at
low temperatures disassortative networks perform the best, although the
effect is small. This is reminiscent of percolation: at high densities of edges the
giant component is larger in disassortative networks, but in assortative ones a
non-vanishing fraction of nodes remain interconnected even at densities below
the usual percolation threshold (Newman, 2002, 2003a). Because in the case
of targeted attacks it is this threshold which is taken as a measure of resilience,
we say that assortative networks perform the best. The relevance of partial
synchronization and the important role of hubs have already been noted for
systems of (weakly) coupled oscillators (Gomez-Gardenes et al., 2007; Pereira,
2010) - for which, however, assortativity has not been expected to be of
consequence (Pereira, 2010). In general, the optimal network for good conditions
(i.e., complete synchronization, high density of edges, low levels of noise) is not
necessarily the one which performs the best in bad conditions (partial
synchronization, low density of edges, high levels of noise). It seems that optimality
- whether in resilience or robustness - should thus be defined for particular
conditions.

Figure 6.8: Global order parameter $\zeta$ for assortative ($\beta = 0.5$), neutral
($\beta = 0.0$) and disassortative ($\beta = -0.5$) networks with $P = 3$ (left panel) and
$P = 10$ (right panel) stored patterns. Symbols from MC simulations, with
errorbars representing standard deviations. All parameters are as in Fig. 6.1.



We have used the technique suggested by Johnson et al. (2010b) to study the
effect of correlations on networks of model neurons, but many other systems of
dynamical elements should be susceptible to a similar treatment. In fact,
Ising spins (Bianconi, 2002), Voter Model agents (Suchecki et al., 2005) or
Boolean nodes (Peixoto, 2010), for instance, are similar enough to binary
neurons that we should expect similar results for these models. If a moral can
be drawn, it is that persistence of partial synchrony, or coherence of a
subset of highly connected dynamical elements, can sometimes be as relevant as
(or more so than) the possibility of every element behaving in the same way.
In the case of real brain cells, experiments suggest that hub neurons play key
functional roles (Morgan and Soltesz, 2008; Bonifazi et al., 2009). From this
point of view, there may be a selective pressure for brain networks to become
assortative - although, admittedly, this organ engages in such complex
behaviour that there must be many more functional constraints on its structure
than just a high robustness to noise. Nevertheless, it would be interesting to
investigate this aspect of biological systems experimentally. For this, it
should be borne in mind that heterogeneous networks have a natural tendency to
become disassortative, so it is against the expected value of correlations
discussed by Johnson et al. (2010b) that empirical data should be contrasted
in order to look for meaningful deviations towards assortativity. Similarly,
it may be necessary to take into account the correlations that could emerge
due to the spatial layout of neurons (Kaiser et al., 2007; Johnson et al.,
2011). In any case, it would be in areas of the cortex specifically related to
memory - such as the temporal (long-term memory) (Miyashita, 1988; Sakai and
Miyashita, 1991) or prefrontal (short-term memory) (Camperi and Wang, 1998b;
Compte et al., 2003) lobes - that this effect might be relevant. A curious
fact that would seem to support our hypothesis is that whereas the vast
majority of non-social networks are disassortative (Newman, 2003d), one that
appears actually to be strongly assortative is the functional network of the
human cortex (Eguíluz et al., 2005).



Chapter 7 



Cluster Reverberation: A 
mechanism for robust 
short-term memory without 
synaptic learning 

Short-term memory cannot in general be explained the way long-term memory 
can - as a gradual modification of synaptic conductances - since it takes place 
too quickly. Theories based on some form of cellular bistability, however, do 
not seem to be able to account for the fact that noisy neurons can collectively 
store information in a robust manner. We show how a sufficiently clustered 
network of simple model neurons can be instantly induced into metastable 
states capable of retaining information for a short time. Cluster Reverbera- 
tion, as we call it, could constitute a viable mechanism available to the brain 
for robust short-term memory with no need of synaptic learning. Relevant phe- 
nomena described by neurobiology and psychology, such as power-law statistics 
of forgetting avalanches, emerge naturally from this mechanism. 



7.1 Slow but sure, or fast and fleeting? 



Of all brain phenomena, memory is probably one of the best understood (Amit,
1989; Abbott and Kepler, 1990; Torres and Varona, 2010). Consider a set of
many neurons, defined as elements with two possible states (firing or not
firing, one or zero) connected among each other in some way by synapses which
carry a proportion of the current let off by a firing neuron to its
neighbours; the probability that a given neuron has of firing at a certain
time is then some function of the total current it has just received. Such a
simplified model of the brain is able to store and retrieve information, in
the form of patterns of activity (i.e., particular configurations of firing
and non-firing neurons) when the synaptic conductances, or weights, have been
appropriately set according to a learning rule (Hebb, 1949). Because each of
the stored patterns becomes an attractor of the dynamics, the system will
evolve towards whichever of the patterns most resembles the initial
configuration. Artificial systems used for tasks such as pattern recognition
and classification, as well as more realistic neural network models that take
into account a variety of subcellular processes, all tend to rely on this
basic mechanism, known as Associative Memory (Amari, 1972; Hopfield, 1982).



Synaptic conductances in animal brains have indeed been found to become
strengthened or weakened during learning, via the biochemical processes of
long-term potentiation (LTP) and depression (LTD) (Malenka and Nicoll, 1999;
Gruart et al., 2006; Roo et al., 2008; Rodriguez-Moreno and Paulsen, 2008;
Kwag and Paulsen, 2009). Further support for the hypothesis that such a
mechanism underlies long-term memory (LTM) comes from psychology, where it is
being found more and more that so-called connectionist models fit in well with
observed brain phenomena (Marcus and G.F., 2001; Frank, 1997). However, some
memory processes take place on timescales of seconds or less and in many
instances cannot be accounted for by LTP and LTD (Durstewitz et al., 2000),
since these require at least minutes to be effected (Lee et al., 1980;
Klintsova and Greenough, 1999). For example, Sperling found that visual
stimuli are recalled in great detail for up to about one second after exposure
(iconic memory) (Sperling, 1960); similarly, acoustic information seems to
linger for three or four seconds (echoic memory) (Cowan, 1984). In fact, it
appears that the brain actually holds and continually updates a kind of buffer
in which sensory information regarding its surroundings is maintained (sensory
memory) (Baddeley, 1999). This is easily observed by simply closing one's eyes
and recalling what was last seen, or thinking about a sound after it has
finished. Another instance is the capability referred to as working memory
(Durstewitz et al., 2000; Baddeley and A.D., 2003): just as a computer
requires RAM for its calculations despite having a hard drive for long-term
storage, the brain must continually store and delete information to perform
almost any cognitive task. To some extent, working memory could consist in
somehow labelling or bringing forward previously stored concepts, like when
one is asked to remember a particular sequence of digits or familiar shapes.
But we are also able to manipulate - if perhaps not quite so well - shapes and
symbols we have only just become acquainted with, too recently for them to
have been learned synaptically. We shall here use short-term memory (STM) to
describe the brain's ability to store information on a timescale of seconds or
less^1.

Evidence that short-term memory is related to sensory information while
long-term memory is more conceptual can again be found in psychology. For
instance, a sequence of similar sounding letters is more difficult to retain
for a short time than one of phonetically distinct ones, while this has no
bearing on long-term memory, for which semantics seems to play the main role
(Conrad, 1964a,b); and the way many of us think about certain concepts, such
as chess, geometry or music, is apparently quite sensorial: we imagine
positions, surfaces or notes as they would look or sound. Most theories of
short-term memory - which almost always focus on working memory - make use of
some form of previously stored information (i.e., of synaptic learning) and so
can account for the labelling tasks referred to above but not for the instant
recall of novel information (Wang, 2001; Barak and Tsodyks, 2007; Roudi and
Latham, 2007; Mongillo et al., 2008; Mejias and Torres, 2009). Attempts to
deal with the latter have been made by proposing mechanisms of cellular
bistability: neurons are assumed to retain the state they are placed in (such
as firing or not firing) for some period of time thereafter (Camperi and Wang,
1998a; Teramae and Fukai, 2005; Tarnow, 2008). Although there may indeed be
subcellular processes leading to a certain bistability, the main problem with
short-term memory depending exclusively on such a mechanism is that if each
neuron must act independently of the rest the patterns will not be robust to
random fluctuations (Durstewitz et al., 2000) - and the behaviour of
individual neurons is known to be quite noisy (Compte et al., 2003). It is
worth pointing out that one of the strengths of Associative Memory is that the
behaviour of a given neuron depends on many neighbours and not just on itself,
which means that robust global recall can emerge despite random fluctuations
at an individual level.

^1 We should mention that sensory memory is usually considered distinct from
STM - and probably has a different origin - but we shall use "short-term
memory" generically since the mechanism we propose in this paper could be
relevant for either or both phenomena. On the other hand, the recent flurry of
research in psychology and neuroscience on working memory has led to this term
sometimes being used to mean short-term memory; strictly speaking, however,
working memory is generally considered to be an aspect of cognition which
operates on information stored in STM.

Something that, at least until recently, most neural network models have
failed to take into account is the structure of the network - its topology -
it often being assumed that synapses are placed among the neurons completely
at random, or even that all neurons are connected to all the rest (a
mathematically convenient but unrealistic situation). Although relatively
little is yet known about the architecture of the brain at the level of
neurons and synapses, experiments have shown that it is heterogeneous (some
neurons have very many more synapses than others), clustered (two neurons have
a higher chance of being connected if they share neighbours than if not) and
highly modular (there are groups, or modules, with neurons forming synapses
preferentially to those in the same module) (Sporns et al., 2004; Johnson et
al., 2010a). This chapter describes the main result of Ref. (Johnson et al.,
2011) - namely, that it suffices to use a more realistic topology, in
particular one which is modular and/or clustered, for a randomly chosen
pattern of activity the system is placed in to be metastable. This means that
novel information can be instantly stored and retained for a short period of
time in the absence of both synaptic learning and cellular bistability. The
only requisite is that the patterns be coarse-grained versions of the usual
patterns - that is, whereas it is often assumed that each neuron in some way
represents one bit of information, we shall allocate a bit to a small group of
neurons^2 (four or five can be enough).

The mechanism, which we call Cluster Reverberation, is very simple. If neurons
in a group are more highly connected to each other than to the rest of the
network, either because they form a module or because the network is
significantly clustered, they will tend to retain the activity of the group:
when they are all initially firing, they each continue to receive many action
potentials and so go on firing, whereas if they start off silent, there is not
usually enough input current from the outside to set them off. The fact that
each neuron's state depends on its neighbours confers on the mechanism a
certain robustness in the face of random fluctuations. This robustness is
particularly important for biological neurons, which as mentioned are quite
noisy. Furthermore, not only does the limited duration of short-term memory
states emerge naturally from this mechanism (even in the absence of
interference from new stimuli) but this natural forgetting follows power-law
statistics, as in experimental settings (Wixted and Ebbesen, 1991, 1997;
Sikstrom, 2002).

^2 This does not, of course, mean that memories are expected to be encoded as
bitmaps. Just as with individual neurons, positions or orientations, say,
could be represented by the activation of particular sets of clusters.

The process is reminiscent both of block attractors in ordinary neural
networks (Dominguez et al., 2009) and of domains in magnetic materials (A. and
R., 1998), while Munoz et al. have recently highlighted a similarity with
Griffiths phases on networks (Munoz et al., 2010). It can also be interpreted
as a multiscale phenomenon: the mesoscopic clusters take on the role usually
played by individual neurons, yet make use of network properties. Although the
mechanism could also work in conjunction with other ones, such as synaptic
learning or cellular bistability, we shall illustrate it by considering the
simplest model which has the necessary ingredients: a set of binary neurons
linked by synapses of uniform weight according to a topology whose modularity
or clustering we shall tune. As with Associative Memory, this mechanism of
Cluster Reverberation appears to be simple and robust enough not to be
qualitatively affected by the complex subcellular processes incorporated into
more realistic neuron models - such as integrate-and-fire or Hodgkin-Huxley
neurons. However, such refinements are probably needed to achieve graded
persistent activity, since the mean frequency of each cluster could then be
set to a certain value.



7.2 The simplest neurons on modular networks 

We consider a network of $N$ model neurons, with activities $s_i = \pm 1$. The
topology is given by the adjacency matrix $a_{ij} = \{1, 0\}$, each element
representing the existence or absence of a synapse from neuron $j$ to neuron
$i$ ($a$ need not be symmetric). In this kind of model, each edge usually has
a synaptic weight associated, $\omega_{ij} \in \mathbb{R}$; however, we shall
here consider these to have all the same value: $\omega_{ij} = \omega$,
$\forall i, j$. Neurons are updated in parallel (Little dynamics) at each time
step, according to the stochastic transition rule

$$P(s_i \to \pm 1) = \pm\frac{1}{2}\tanh\left(\frac{h_i}{T}\right) + \frac{1}{2},$$

where the field of neuron $i$ is defined as

$$h_i = \omega \sum_{j}^{N} a_{ij} s_j$$

and $T$ is a parameter we shall call temperature.
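
As a minimal sketch of this update rule (an illustration only, not the code
used for the simulations reported in this chapter), one parallel step can be
written as follows. The function names `field` and `parallel_update`, the
nested-list representation of $a$ and the three-neuron example are assumptions
made for the illustration.

```python
import math
import random

def field(a, s, omega, i):
    """Field h_i = omega * sum_j a[i][j] * s[j]; a[i][j] = 1 means a synapse j -> i."""
    return omega * sum(a_ij * s_j for a_ij, s_j in zip(a[i], s))

def parallel_update(a, s, omega, T, rng):
    """One step of Little dynamics: every neuron is updated from the same old state s."""
    new_s = []
    for i in range(len(s)):
        h = field(a, s, omega, i)
        p_up = 0.5 * math.tanh(h / T) + 0.5   # P(s_i -> +1)
        new_s.append(1 if rng.random() < p_up else -1)
    return new_s

# Hypothetical example: three mutually connected neurons (no self-edges),
# all firing, at low temperature.
a = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
s = parallel_update(a, [1, 1, 1], omega=1.0, T=0.05, rng=random.Random(0))
```

At this temperature the fields are large compared with $T$, so the group simply
keeps its activity; raising $T$ makes individual flips increasingly likely.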

First of all, we shall consider the network defined by $a$ to be made up of
$M$ distinct modules. To achieve this, we can first construct $M$ separate
random directed networks, each with $n = N/M$ nodes and mean degree (mean
number of neighbours) $\langle k \rangle$. Then we evaluate each edge and,
with probability $\lambda$, eliminate it and replace it with another edge
between the original post-synaptic neuron and a new pre-synaptic neuron chosen
at random from among any of those in other modules^3. Note that this protocol
does not alter the number of pre-synaptic neighbours of each node, $k_i^{in} =
\sum_j a_{ij}$ (although the number of post-synaptic neurons, $k_i^{out} =
\sum_j a_{ji}$, can vary). The parameter $\lambda$ can be seen as a measure of
modularity of the partition considered, since it coincides with the expected
value of the proportion of edges that link different modules. In particular,
$\lambda = 0$ defines a network of disconnected modules, while $\lambda = 1 -
M^{-1}$ yields a random network in which this partition has no modularity. If
$\lambda \in (1 - M^{-1}, 1)$, the partition is less than randomly modular -
i.e., it is quasi-multipartite (or multipartite if $\lambda = 1$).
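
The construction just described can be sketched in a few lines. This is an
illustration under stated assumptions rather than the procedure actually used
for the figures: for simplicity every neuron is given exactly `k` pre-synaptic
neighbours (instead of a random number with mean $\langle k \rangle$), at least
two modules are assumed, and the names `modular_network` and
`outgoing_fraction` are hypothetical.

```python
import random

def modular_network(M, n, k, lam, rng):
    """Return (a, module) for N = M*n neurons; a[i][j] = 1 means a synapse j -> i.

    Each neuron first receives k synapses from distinct neurons of its own
    module (no self-edges); each such edge is then rewired, with probability
    lam, to a pre-synaptic neuron drawn at random from the other modules.
    Requires M >= 2 and k <= n - 1.
    """
    N = M * n
    module = [i // n for i in range(N)]
    a = [[0] * N for _ in range(N)]
    for i in range(N):
        own = [j for j in range(module[i] * n, (module[i] + 1) * n) if j != i]
        for j in rng.sample(own, k):
            if rng.random() < lam:
                # Rewire: keep post-synaptic neuron i and pick a new pre-synaptic
                # neuron outside the module that is not yet connected to i.
                while True:
                    j = rng.randrange(N)
                    if module[j] != module[i] and a[i][j] == 0:
                        break
            a[i][j] = 1
    return a, module

def outgoing_fraction(a, module):
    """Proportion of edges that link different modules (an estimate of lambda)."""
    N = len(a)
    total = sum(map(sum, a))
    inter = sum(a[i][j] for i in range(N) for j in range(N) if module[i] != module[j])
    return inter / total

a, module = modular_network(M=8, n=10, k=5, lam=0.25, rng=random.Random(1))
lam_measured = outgoing_fraction(a, module)
```

By construction the in-degree of every neuron is unchanged by the rewiring,
and `lam_measured` fluctuates around the value of `lam` passed in.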

If the size of the modules is of the order of $\langle k \rangle$, the network
will also be highly clustered. Taking into account that the network is
directed, let us define the clustering coefficient $C_i$ as the probability,
given that there is a synapse from neuron $i$ to a neuron $j$ and from another
neuron $l$ to $i$, that there be a synapse from $j$ to $l$: that is, that
there exist a feedback loop $i \to j \to l \to i$. Then, assuming $M \gg 1$,
the expected value of the clustering coefficient $C \equiv \langle C_i
\rangle$ is

$$C \gtrsim \frac{\langle k \rangle}{n - 1}(1 - \lambda)^3.$$

7.3 Cluster Reverberation 

A memory pattern, in the form of a given configuration of activities, $\{\xi_i
= \pm 1\}$, can be stored in this system with no need of prior learning.
Imagine a pattern such that the activities of all $n$ neurons found in any
module are the same, i.e., $\xi_i = \xi_{\mu(i)}$, where the index $\mu(i)$
denotes the module that neuron $i$ belongs to. This can be thought of as a
coarse graining of the standard idea of memory patterns, in which each neuron
represents one bit of information. In our scheme, each module represents - and
stores - one bit. The system can be induced into this configuration via the
application of an appropriate stimulus (see Fig. 7.1): the field of each
neuron will be altered for just one time step according to

$$h_i \to h_i + \delta\,\xi_{\mu(i)}, \quad \forall i,$$

where the factor $\delta$ is the intensity of the stimulus. This mechanism for
dynamically storing information will work for values of parameters such that
the system is sensitive to the stimulus, acquiring the desired configuration,
yet also able to retain it for some interval of time thereafter.

^3 We do not allow self-edges (although these can occur in reality) since they
can be regarded as a form of cellular bistability.

Figure 7.1: Diagram of a modular network composed of four five-neuron
clusters. The four circles enclosed by the dashed line represent the stimulus:
each is connected to a particular module, which adopts the input state (red or
blue) and retains it after the stimulus has disappeared via Cluster
Reverberation.
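
The stimulation protocol of this section can be illustrated with a toy example.
The sketch below is an assumption-laden illustration, not the simulation code
behind the figures: it uses two fully connected, mutually disconnected modules
of three neurons, and the values of `delta`, `omega` and `T`, as well as the
function name `step`, are chosen only for the demonstration.

```python
import math
import random

def step(a, s, omega, T, rng, stimulus=None):
    """One parallel update; stimulus[i] (if given) is added to the field h_i."""
    new_s = []
    for i, row in enumerate(a):
        h = omega * sum(a_ij * s_j for a_ij, s_j in zip(row, s))
        if stimulus is not None:
            h += stimulus[i]
        new_s.append(1 if rng.random() < 0.5 * math.tanh(h / T) + 0.5 else -1)
    return new_s

# Two fully connected modules of three neurons each (no edges between modules).
n, M = 3, 2
N = n * M
module = [i // n for i in range(N)]
a = [[1 if i != j and module[i] == module[j] else 0 for j in range(N)]
     for i in range(N)]

xi_module = [+1, -1]                     # one bit per module
delta = 3.0                              # stimulus intensity (illustrative value)
stimulus = [delta * xi_module[module[i]] for i in range(N)]

rng = random.Random(0)
s = [1] * N                              # start from one of the main attractors
s = step(a, s, omega=1.0, T=0.02, rng=rng, stimulus=stimulus)  # stimulated step
for _ in range(20):                      # stimulus removed: the pattern persists
    s = step(a, s, omega=1.0, T=0.02, rng=rng)
```

Here the stimulus exceeds the field each neuron receives from its two
neighbours, so the second module is switched off in a single step and then
sustains its new state without any change to the synapses.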

The two main attractors of the system are $s_i = 1$ $\forall i$ and $s_i = -1$
$\forall i$. These are the configurations of minimum energy (see the next
section for a more detailed discussion on energy). However, the energy is
locally minimised for any configuration in which $s_i = \xi_{\mu(i)}$ $\forall
i$ with $\xi_\mu = \pm 1$; that is, configurations such that each module
comprises either all active or all inactive neurons. These are the
configurations that we shall use to store information. We define the mean
activity^4 of each module,

$$m_\mu \equiv \frac{1}{n} \sum_{i \in \mu} s_i,$$

which is a mesoscopic variable, as well as the global mean activity,

$$m \equiv \frac{1}{N} \sum_{i} s_i = \frac{1}{M} \sum_{\mu} m_\mu$$

(these magnitudes change with time, but, where possible, we shall avoid
writing the time dependence explicitly for clarity). The extent to which the
network, at a given time, retains the pattern with which it was stimulated is
measured with the overlap parameter

$$m_{stim} \equiv \frac{1}{N} \sum_{i} \xi_i s_i = \frac{1}{M} \sum_{\mu} \xi_\mu m_\mu.$$

^4 The mean activity in a neural network model is usually taken to represent
the mean firing rate measured in experiments (Torres and Varona, 2010).



Ideally, the system should be capable of reacting immediately to a stimulus by
adopting the right configuration, yet also be able to retain it for long
enough to use the information once the stimulus has disappeared. A measure of
performance for such a task is therefore

$$\eta \equiv \frac{1}{\tau} \sum_{t = t_0 + 1}^{t_0 + \tau} m_{stim}(t),$$

where $t_0$ is the time at which the stimulus is received and $\tau$ is the
period of time we are interested in ($|\eta| \leq 1$) (Johnson et al., 2008).
If the intensity of the stimulus, $\delta$, is very large, then the system
will always adopt the right pattern perfectly and $\eta$ will only depend on
how well it can then retain it. In this case, the best network will be one
that is made up of unconnected modules. However, since the stimulus in a real
brain can be expected to arrive via a relatively small number of axons, either
from another part of the brain or directly from sensory cells, it is more
realistic to assume that $\delta$ is of a similar order as the input a typical
neuron receives from its neighbours, $\langle h \rangle \sim \omega \langle k
\rangle$.
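
For concreteness, the overlap and the performance measure can be computed from
a recorded sequence of network states as in the following sketch. The function
names `overlap` and `performance`, and the hand-written four-neuron states, are
assumptions made for the illustration.

```python
def overlap(xi, s):
    """m_stim = (1/N) * sum_i xi_i * s_i for a pattern xi and a state s."""
    return sum(x * y for x, y in zip(xi, s)) / len(s)

def performance(xi, states):
    """eta: time average of the overlap over the states at t0+1, ..., t0+tau."""
    return sum(overlap(xi, s) for s in states) / len(states)

xi = [1, 1, -1, -1]          # pattern shown to two modules of two neurons
states = [
    [1, 1, -1, -1],          # perfect recall: overlap 1
    [1, 1, -1, 1],           # one neuron flipped: overlap 0.5
    [1, 1, 1, 1],            # second module forgotten: overlap 0
]
eta = performance(xi, states)   # (1 + 0.5 + 0) / 3 = 0.5
```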

Fig. 7.2 shows the mean performance obtained when the network is repeatedly
stimulated with different randomly generated patterns. For low enough values
of the modularity $\lambda$ and stimuli of intensity $\delta \geq \omega
\langle k \rangle$, the system can capture and successfully retain any pattern
it is "shown" for some period of time, even though this pattern was in no way
previously learned. For less intense stimuli ($\delta < \omega \langle k
\rangle$), performance is nonmonotonic with modularity: there exists an
optimal value of $\lambda$ at which the system is sensitive to stimuli yet
still able to retain new patterns quite well.

Figure 7.2: Performance $\eta$ against $\lambda$ for networks of the sort
described in the main text with $M = 160$ modules of $n = 10$ neurons,
$\langle k \rangle = 9$; patterns are shown with intensities $\delta = 8.5$, 9
and 10, and $T = 0.02$ (lines - splines - are drawn as a guide to the eye).
Inset: typical time series of $m_{stim}$ (i.e., the overlap with whichever
pattern was last shown) for $\lambda = 0.0$, 0.25, and 0.5, and $\delta =
\langle k \rangle = 9$.

It is worth noting that performance can also break down due to thermal
fluctuations. The two main attractors of the system ($s_i = 1$ $\forall i$ and
$s_i = -1$ $\forall i$) suffer the typical second order phase transition of
the Hopfield model (Hopfield, 1982), from a memory phase (one in which $m = 0$
is not stable and stable solutions $m \neq 0$ exist) to one with no memory
(with $m = 0$ the only stable solution), at the critical temperature (Johnson
et al., 2008)

$$T_c = \omega \frac{\langle k^{in} k^{out} \rangle}{\langle k \rangle}.$$

(Note that, in a directed network, $\langle k^{in} \rangle = \langle k^{out}
\rangle \equiv \langle k \rangle$, although the other moments can in general
be different.) The metastable states we are interested in, though, have a
critical temperature

$$T_c' = (1 - \lambda) T_c$$

(assuming that the mean activity of the network is $m \simeq 0$). That is, the
temperature at which the modules are no longer able to retain their individual
activity is in general lower than that at which the solution $m = 0$ for the
whole network becomes stable.

7.4 Energy and topology

Each pair of nodes contributes a configurational energy $e_{ij} =
-\frac{1}{2}\omega(a_{ij} + a_{ji}) s_i s_j$; that is, if there is an edge
from $i$ to $j$ and they have opposite activities, the energy is increased by
$\frac{1}{2}\omega$, whereas it is decreased by the same amount if their
activities are the same. Given a configuration, we can obtain its associated
energy by summing over all pairs. We shall be interested in configurations
with $x$ neurons that have $s = +1$ (and $N - x$ with $s = -1$), chosen in
such a way that one module at most, say $\mu$, has neurons in both states
simultaneously. Therefore, $x = np + z$, where $p$ is the number of modules
with all their neurons in the positive state and $z$ is the number of neurons
with positive sign in module $\mu$. We can write $m = 2x/N - 1$ and $m_\mu =
2z/n - 1$. The total configurational energy of the system will be $E =
\sum_{ij} e_{ij} = \frac{1}{2}\omega(L^{\uparrow\downarrow} - \langle k
\rangle N)$, where $L^{\uparrow\downarrow}$ is the number of edges linking
nodes with opposite activities. By simply counting over edges, we can obtain
the expected value of $L^{\uparrow\downarrow}$ (which amounts to a mean-field
approximation because we are substituting the number of edges between two
neurons for its expected value), yielding:

$$\frac{E}{\omega \langle k \rangle} = (1 - \lambda)\frac{z(n - z)}{n - 1} + \frac{\lambda n}{2(N - n)}\left\{p\left[n - z + n(M - p - 1)\right] + (M - p - 1)(z + np)\right\} - \frac{1}{2}N. \qquad (7.1)$$

Fig. 7.3 shows the mean-field configurational energy curves for various values
of the modularity on a small modular network. The local minima (metastable
states) are the configurations used to store patterns. It should be noted that
the mapping $x \to m$ is highly degenerate: there are $\binom{M}{p}$ patterns
with mean activity $m$ (those with $p = \frac{1}{2}M(1 + m)$ active modules)
that all have the same energy.
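
To make the pairwise definition of the configurational energy concrete, the
sketch below sums $e_{ij}$ directly over all pairs of a small hand-built
network. It is only an illustration: the six-neuron network (two fully
connected modules joined by a single synapse) and the function name
`configurational_energy` are assumptions, and no mean-field approximation is
involved.

```python
def configurational_energy(a, s, omega):
    """Sum of e_ij = -0.5*omega*(a_ij + a_ji)*s_i*s_j over unordered pairs i < j."""
    N = len(s)
    return sum(-0.5 * omega * (a[i][j] + a[j][i]) * s[i] * s[j]
               for i in range(N) for j in range(i + 1, N))

# Two fully connected modules of three neurons, plus one synapse from neuron 3
# to neuron 0 linking the modules (a[i][j] = 1 means a synapse j -> i).
n, N = 3, 6
a = [[1 if i != j and i // n == j // n else 0 for j in range(N)] for i in range(N)]
a[0][3] = 1

E_uniform = configurational_energy(a, [1, 1, 1, 1, 1, 1], omega=1.0)     # -6.5
E_pattern = configurational_energy(a, [1, 1, 1, -1, -1, -1], omega=1.0)  # -5.5
E_flipped = configurational_energy(a, [1, -1, 1, -1, -1, -1], omega=1.0) # -1.5
```

The uniform state has the lowest energy, the module-wise pattern lies slightly
above it, and flipping a single neuron inside a module costs considerably
more, which is what makes the module-wise configurations local minima.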



7.5 Forgetting avalanches 

In obtaining the energy we have assumed that the number of synapses rewired
from a given module is always $\nu = \langle k \rangle n \lambda$. However,
since each edge is rewired with probability $\lambda$, $\nu$ will in fact vary
somewhat from one module to another, being approximately Poisson distributed
with mean $\langle \nu \rangle = \langle k \rangle n \lambda$. The depth of
the energy well corresponding to a given module is then, neglecting all but
the first term in Eq. (7.1) and approximating $n - 1 \simeq n$,

$$\Delta E \simeq \frac{1}{4}\omega\left(n \langle k \rangle - \nu\right).$$

Figure 7.3: Configurational energy of a network composed of $M = 20$ modules
of $n = 10$ neurons each, according to Eq. (7.1), for various values of the
rewiring probability $\lambda$. The minima correspond to situations such that
all neurons within any given module have the same sign.



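The typical escape time $\tau$ from an energy well of depth $\Delta E$ at
temperature $T$ is $\tau \sim e^{\Delta E / T}$ (Levine and R.D., 2005). Using
Stirling's approximation in the Poissonian distribution of $\nu$ and
expressing it in terms of $\tau$, we find that the escape times are
distributed according to

$$P(\tau) \sim \left[1 - \frac{4T}{\omega n \langle k \rangle} \ln \tau\right]^{-1/2} \tau^{-\beta(\tau)}, \qquad (7.2)$$

where

$$\beta(\tau) = 1 + \frac{4T}{\omega n \langle k \rangle}\left[1 + \ln\left(\frac{\lambda n \langle k \rangle}{1 - \frac{4T}{\omega n \langle k \rangle} \ln \tau}\right)\right]. \qquad (7.3)$$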



Therefore, at low temperatures, $P(\tau)$ will behave approximately like a
power-law. The left panel of Fig. 7.4 shows the distribution of time intervals
between events in which the overlap of at least one module $\mu$ changes sign.
The power-law-like behaviour is apparent, and justifies talking about
forgetting avalanches - since there are cascades of many forgetting events
interspersed with long periods of metastability. This is very similar to the
behaviour observed in other nonequilibrium settings in which power-law
statistics arise from the convolution of exponentials (Hurtado et al., 2008;
Munoz et al., 2010).

It is known from experimental psychology that forgetting in humans is indeed
well described by power-laws (Wixted and Ebbesen, 1991, 1997; Sikstrom, 2002).
The right panel of Fig. 7.4 shows the value of the exponent $\beta(\tau)$ as a
function of $\tau$. Although for low temperatures it is almost constant over
many decades of $\tau$ - approximating a pure power-law - for any finite $T$
there will always be a $\tau$ such that the denominator in the logarithm of
Eq. (7.3) approaches zero and $\beta$ diverges, signifying a truncation of the
distribution.

Figure 7.4: Left panel: distribution of escape times $\tau$, as defined in the
main text, for $\lambda = 0.25$ and $T = 2$. Slope is for $\beta = 1.35$.
Other parameters as in Fig. 7.2. Symbols from MC simulations and line given by
Eqs. (7.2) and (7.3). Right panel: exponent $\beta$ of the quasi-power-law
distribution $P(\tau)$ as given by Eq. (7.3) for temperatures $T = 1$ (red
line), $T = 2$ (green line) and $T = 3$ (blue line).
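
The origin of the broad distribution of escape times can be explored
numerically without any closed-form expression: draw the number of rewired
synapses of a module from a Poisson distribution and convert the resulting
well depth into an escape time. The sketch below does this under stated
assumptions: the well depth is taken as $\Delta E = \omega(n \langle k \rangle
- \nu)/4$, the parameter values are those quoted for Fig. 7.4 with $\omega =
1$ assumed, and the helper names `sample_poisson` and `escape_time` are
hypothetical.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def escape_time(nu, n, k, omega, T):
    """tau ~ exp(dE / T), with well depth dE = omega * (n*k - nu) / 4."""
    return math.exp(0.25 * omega * (n * k - nu) / T)

n, k, lam, omega, T = 10, 9, 0.25, 1.0, 2.0   # omega = 1 is an assumption
rng = random.Random(0)
nus = [sample_poisson(k * n * lam, rng) for _ in range(10000)]
taus = [escape_time(nu, n, k, omega, T) for nu in nus]
```

A histogram of `taus` on logarithmic axes can then be compared with the
quasi-power-law behaviour discussed above; small fluctuations in `nu` are
exponentially amplified in `tau`, which is why the distribution is so broad.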



7.6 Clustered networks 

Although we have illustrated how the mechanism of Cluster Reverberation works
on a modular network, it is not actually necessary for the topology to have
this characteristic - only for the patterns to be in some way
"coarse-grained," as described, and that each region of the network encoding
one bit have a small enough parameter $\lambda$, defined as the proportion of
synapses to other regions. For instance, for the famous Watts-Strogatz
small-world model (Watts and Strogatz, 1998) - a ring of $N$ nodes, each
initially connected to its $k$ nearest neighbours before a proportion $p$ of
the edges are randomly rewired - we have $\lambda \simeq p$ (which is not
surprising considering the resemblance between this model and the modular
network used above). More precisely, the expected modularity of a randomly
imposed box of $n$ neurons is

$$\lambda = p - \frac{n - 1}{N - 1}\,p + \frac{1 - p}{n}\left(\frac{k}{4} + \frac{1}{2}\right),$$

the second term on the right accounting for the edges rewired to the same box,
and the third to the edges not rewired but sufficiently close to the border to
connect with a different box.
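
The border term can be checked by direct counting in the unrewired limit $p =
0$. The following sketch is an illustration under stated assumptions: it reads
the third term as $(1 - p)(k/4 + 1/2)/n$, takes $k$ even, and uses hypothetical
function names and parameter values.

```python
def outgoing_fraction_ring(N, k, n):
    """Fraction of edge ends in a box of n consecutive nodes that leave the box,
    for an unrewired ring in which each node links to its k nearest neighbours."""
    half = k // 2
    box = set(range(n))
    ends_in_box, ends_leaving = 0, 0
    for i in range(n):
        for d in range(1, half + 1):
            for j in ((i + d) % N, (i - d) % N):
                ends_in_box += 1
                if j not in box:
                    ends_leaving += 1
    return ends_leaving / ends_in_box

def expected_lambda(N, k, n, p):
    """Expected proportion of edges leaving a box of n nodes, as read above."""
    return p - (n - 1) / (N - 1) * p + (1 - p) / n * (k / 4 + 0.5)

N, k, n = 1000, 6, 20
measured = outgoing_fraction_ring(N, k, n)       # 12 of 120 edge ends leave
predicted = expected_lambda(N, k, n, p=0.0)      # (1/20) * (6/4 + 1/2)
```

For these values both quantities equal 0.1, since six edges cross each of the
two borders of the box.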

Perhaps a more realistic model of clustered network would be a random network
embedded in $d$-dimensional Euclidean space. For this we shall use the scheme
laid out by Rozenfeld et al. (Rozenfeld et al., 2002), which consists simply
in allocating each node to a site on a $d$-torus and then, given a particular
degree sequence, placing edges to the nearest nodes possible - thereby
attempting to minimise total edge length^5. For a scale-free degree sequence
(i.e., a set $\{k_i\}$ drawn from a degree distribution $p(k) \sim
k^{-\gamma}$) according to some exponent $\gamma$, it can then be shown that
such a network has a modularity

$$\lambda \simeq \frac{1}{d(\gamma - 2) - 1}\left[d(\gamma - 2)\,l^{-1} - l^{-d(\gamma - 2)}\right], \qquad (7.4)$$

where $l$ is the linear size of the boxes considered.

Fig. 7.5 compares this expression with the value obtained numerically after
averaging over many network realizations, and shows that it is fairly good -
considering the approximations used for its derivation. It is interesting that
even in this scenario, where the boxes of neurons which are to receive the
same stimulus are chosen at random with no consideration for the underlying
topology, these boxes need not have very many neurons for $\lambda$ to be
quite low (as long as the degree distribution is not too heterogeneous).

Carrying out the same repeated stimulation test as on the modular networks in
Fig. 7.2, we find a similar behaviour for the scale-free embedded networks.
This is shown in Fig. 7.6, where for high enough intensity of stimuli $\delta$
and scale-free exponent $\gamma$, performance can, as in the modular case, be
$\eta \simeq 1$. We should point out that for good performance on these
networks we require more neurons for each bit of information than on modular
networks with the same $\lambda$ (in Fig. 7.6 we use $n = 100$, as opposed to
$n = 10$ in Fig. 7.2). However, that we should be able to obtain good results
for such diverse network topologies underlines that the mechanism of Cluster
Reverberation is robust and not dependent on some very specific architecture.
In fact, we have recently shown that similar metastable memory states can also
occur on networks which have random modularity and clustering, but a certain
degree of assortativity^6 (de Franciscis et al., 2011).

^5 The authors also consider a cutoff distance, but we shall take this to be
infinite here.

^6 The assortativity of a network is here understood to mean the extent to
which the degrees of neighbouring nodes are correlated (Johnson et al.,
2010b).
Figure 7.5: Proportion of outgoing edges, $\lambda$, from boxes of linear size
$l$ against exponent $\gamma$ for scale-free networks embedded on 2D lattices.
Lines from Eq. (7.4) and symbols from simulations with $\langle k \rangle = 4$
and $N = 1600$.

Figure 7.6: Performance $\eta$ against exponent $\gamma$ for scale-free
networks, embedded on a 2D lattice, with patterns of $M = 16$ modules of $n =
100$ neurons each, $\langle k \rangle = 4$ and $N = 1600$; patterns are shown
with intensities $\delta = 3.5$, 4, 5 and 10, and $T = 0.01$ (lines - splines
- are drawn as a guide to the eye). Inset: typical time series for $\gamma =
2$, 3, and 4, with $\delta = 5$.

7.7 Yes, but does it happen in the brain?

As we have shown, Cluster Reverberation is a mechanism available to neural 
systems for robust short-term memory without synaptic learning. To the best 
of our knowledge, this is the first mechanism proposed which has these charac- 
teristics - essential for, say, sensory memory or certain working-memory tasks. 
All that is needed is for the network topology to be highly clustered or modu- 
lar, and for small groups of neurons to store one bit of information, as opposed 
to the conventional view which assumes one bit per neuron. Considering the 
enormous number of neurons in the brain, and the fact that real individual 
neurons are probably too noisy to store information reliably, these hypotheses 
do not seem farfetched. The mechanism is furthermore consistent both with 
what is known about the topology of the brain, and with experiments which 
have revealed power-law forgetting. 

Since the purpose of this paper is only to describe the mechanism of Clus- 
ter Reverberation, we have made use of the simplest possible model neurons - 
namely, binary neurons with static, uniform synapses - for the sake of clarity 
and generality. However, there is no reason to believe that the mechanism 
would cease to function if more neuronal ingredients were to be incorporated. 
In fact, cellular bistability, for instance, would increase performance, and the 
two mechanisms could actually work in conjunction. Similarly, we have also 
used binary patterns to store information. But it is to be expected that pat- 
terns depending on any form of frequency coding, for instance, could also 
be maintained with more sophisticated neurons - such that different modules 
could be set to different mean frequencies. 

Whether Cluster Reverberation would work for biological neural systems 
could be put to the test by growing such modular networks in vitro, stimulating 
appropriately, and observing the duration of the metastable states. In vivo 
recordings of neural activity during short-term memory tasks, together with 
a mapping of the underlying synaptic connections, could be used to ascertain 
whether the brain does indeed make use of this mechanism - although for this 
it must be borne in mind that the neurons forming a module need not find 
themselves close together in metric space. We hope that experiments such as 
these will be carried out and eventually reveal something more about the basis 
of this puzzling emergent property of the brain's known as thought. 
Concluding remarks



"As long as the brain is a mystery, the universe will remain a mystery," claimed
Santiago Ramón y Cajal. Our very essence seems to reside somehow in the
workings of this organ, probably as a consequence of electro-chemical signalling 
that goes on among its hundred billion or so constituent neurons. Will this 
mystery ever be cleared up? We know of other objects that process informa- 
tion in highly sophisticated ways - electronic computers. Faced with a sudden 
blue screen, one may be forgiven for calling these devices incomprehensible and 
capricious, even malevolent. But in fact most educated people understand, on 
some level at least, what mechanisms and physical processes are behind the 
complex behaviour displayed by computers, and do not consider the issue a 
mystery. This is not to suggest that the analogy between brain and computer 
should be taken any further than to illustrate how a great many elements, each 
executing some fairly simple and obvious operation, can "cooperate" to yield 
astonishingly complicated yet functional behaviour; and that one can grasp 
how this occurs without having to know every detail. But we have not yet 
reached this point as regards the brain. Much progress has been made con- 
cerning aspects of physiology, while once unassailable mental disorders such 
as phobias can now be easily cured by psychology. Yet as far as what mecha- 
nisms relate these two levels of description goes, perhaps all we can safely say 
for now is that synaptic plasticity is responsible for long-term memory. The 
origins of even some well-defined and much studied cognitive abilities - such 
as probabilistic reasoning or short-term memory - remain somewhat elusive, 
while the nature of consciousness, say, is still truly a mystery. However, if 
instead of developing computers ourselves we had been given them by an alien
species, we could still hope one day to unravel the mysteries of their magic. In
much the same manner, by searching for ways in which collections of neurons
might perform tasks such as we know them to be capable of, we will some day
understand not only how our stomachs digest and our hearts pump, but also
how our brains think.

I cannot pretend that the work described here takes us more than, at best, 
a tiny step of the way along this path. The brain is, among other things, 
a network, and networks are a kind of mathematical object about which we 
now know much more than just a few years ago. In fact, they are a central 
element of what can arguably be called the most challenging frontier currently 
facing human understanding about the world - the nature of complex systems. 
So, from among the innumerable aspects likely to shape and determine the 
way neurons cooperate, the research presented here focuses on the structure 
of the underlying network. First of all it looks at how this structure can 
develop. Chapter [3] addressed this by formalizing as a stochastic process a 
situation governed by probabilistic events like synaptic growth and death. Such 
simple individual behaviour was shown to be enough to explain many statistical 
features of real neural systems. Furthermore, this Fokker-Planck description 
relating microscopic, stochastic actions to a macroscopic evolution of properties 
such as mean synaptic density, degree heterogeneity or assortativity may help 
to gain insights into the biochemical processes taking place. 

The rest of the thesis is mostly devoted to how aspects of a neural network's
topology might influence or even determine its ability to carry out certain tasks
akin to those the brain undertakes. The fact that dynamical memory performance
ensuing from synaptic depression is favoured by a highly heterogeneous
degree distribution, laid out in Chapter [4], may help to explain why the brain
seems to display such a topology at several levels of description - perhaps
somehow maintaining itself close to a critical point. Similarly, the enhanced
robustness to noise found for positively correlated networks in Chapter [6]
suggests a functional advantage to a neural network being thus wired; a prediction
also in agreement with some experimental findings.

As far as unearthing the mechanisms underpinning how neurons can perform
cognitive tasks goes, though, perhaps the most interesting idea proposed
is that of Cluster Reverberation, in Chapter [7], whereby thanks to modularity
and/or clustering a neural network is able to store information instantly,
without requiring biochemical changes in the synapses. Time will tell whether real
neural systems do indeed harness this mechanism to perform certain short-term
memory tasks.

A collateral but noteworthy aspect of this research is the potentiality for
application elsewhere of some of the mathematical techniques developed. Most
of all, the method for studying correlated networks and dynamics thereon put
forward in Chapter [5] for use in Chapter [6] can be expected to find widespread
use. The answer to the question of why most networks are disassortative given
in Chapter [5], or the relation between degree-degree correlations and nestedness
described in Appendix [C], are examples of this.

Finally I must mention not just the answers I hope to have provided, or at
least hinted at, to some unsolved problems, but also the questions that have
been posed and challenges laid bare: Would a more detailed description of
brain development still be possible with Fokker-Planck equations? Are these
topological effects, found to be at work for the simplest neural models, indeed
so relevant for real neurons? Can Cluster Reverberation be performed in vitro?
The greatest function this thesis could perform would be to stimulate others
to look into these or related issues in more depth than here. But I also hope
it may serve to illustrate the sentiment: what matters how long the path to
the final unravelling of the mysteries is, as long as the going is fun?



Chapter 9 

Conclusions (translated from the Spanish)



"As long as the brain is a mystery, the universe will continue to be a mystery,"
Santiago Ramón y Cajal once said. It seems that our very essence resides
somehow in the workings of this organ, probably as a consequence of the
electro-chemical signals passing among its roughly one hundred billion neurons.
Will this mystery one day be solved? We know of other objects capable of
processing information in a highly sophisticated manner: electronic computers.
Confronted with a blue screen, we might be forgiven for branding such devices
incomprehensible and capricious, not to say malevolent. But in fact most people
understand, on some level at least, which mechanisms and physical processes
underlie the complex behaviour that computers display, and do not consider the
matter a mystery. This is not to say that the analogy between brain and
computer should be taken any further than to illustrate how many elements, each
executing some relatively simple and obvious operation, can "cooperate" and
exhibit astonishingly complicated, yet functional, collective behaviour; and
that one can understand how this occurs without needing to know every last
detail. We have not yet reached the point of being able to answer this question
as regards the brain. We have greatly broadened our knowledge of physiological
aspects, and mental disorders that were once incurable, such as phobias, are
nowadays easily treated by psychology. As for the mechanisms relating these two
levels of description, possibly the only thing we can say with certainty is
that synaptic plasticity lies behind long-term memory. The origins of even some
well-defined and extensively studied cognitive abilities, such as probabilistic
reasoning or short-term memory, are yet to be fully deciphered; while the nature
of consciousness, for example, is still truly a mystery. However, if instead of
having developed computers ourselves we had received them from an alien
species, we could even so hope one day to unravel the mysteries of their magic.
In the same way, by searching for ways in which sets of neurons might perform
the kind of tasks we know them to be capable of, we will one day understand not
only how our stomachs digest and our hearts beat, but also how our brains think.

This work, at best, takes us an infinitesimal step along this path. The brain
is, among many other things, a network, and networks are mathematical objects
about which we know much more today than only a few years ago. In fact, they
are a fundamental element of one of the greatest challenges currently facing
human knowledge: the nature of complex systems. So, from among the innumerable
aspects liable to modify and determine how neurons cooperate, this research
focuses on the structure of the underlying network. First it analyses how this
structure can develop. Chapter [3] approaches this by formalizing, by means of
the theory of stochastic processes, a situation governed by probabilistic
events such as synaptic growth and death. This kind of individual behaviour is
shown to be sufficient to explain many statistical properties of the networks
of real brains. Moreover, this theoretical framework can be reduced to a
description in terms of Fokker-Planck equations, which relate stochastic
microscopic actions to the macroscopic evolution of properties such as the mean
synaptic density, the heterogeneity of the degree distribution or the
assortativity, and which may allow us to extract relevant information about the
biochemical processes involved.

Most of the rest of the thesis deals with how aspects of a neural network's
topology can influence or even determine its ability to carry out certain
cognitive tasks such as those described in a real brain or neural medium. For
example, the fact that, as regards the dynamical memory that emerges thanks to
synaptic depression, performance is higher for a highly heterogeneous degree
distribution, as shown in Chapter [4], could help to explain why the brain seems
to display a topology of this kind at several levels of description, perhaps
even maintaining its activity, in some way not yet fully understood, close to a
critical point. Likewise, the greater robustness of cognitive processes in the
presence of noise in the case of positively correlated networks, as described
in Chapter [6], suggests that there is a functional advantage for a neural
network in adopting this property; a prediction which also fits with some
experimental findings.

As regards unravelling the mechanisms that allow neurons collectively to
perform cognitive tasks, perhaps the most interesting idea proposed here is
that of Cluster Reverberation, in Chapter [7], according to which, thanks to
modularity and/or the degree of clustering, a neural network is able to store
information instantly, without requiring biochemical changes of long-term
potentiation or depression in the synapses. Time will tell whether the brain
really takes advantage of this mechanism to perform certain short-term memory
tasks.

A collateral but noteworthy aspect of this work is the potential of some of the
mathematical techniques developed to be applied to other situations of
interest. Above all, it is to be expected that the method for studying
correlated networks, and dynamics on them, proposed in Chapter [5] and used in
Chapter [6], will be useful for a wide range of problems. The answer, in
Chapter [5], to the question of why most networks are disassortative, or the
relation between correlations among nodes and nestedness described in
Appendix [C], are examples of applications.

Finally, one must mention not only the answers which this thesis has attempted
to give, or at least suggest, to some unsolved problems, but also the questions
and new challenges that have arisen: for example, would a more detailed
description of brain development be possible, also with Fokker-Planck
equations? Are these topological effects, described for the simplest neural
models, really so relevant for real neurons? Can the mechanism of Cluster
Reverberation occur in vitro? In short, the greatest function this thesis could
fulfil would be to stimulate others to look into these and other issues more
deeply than here. But perhaps it may also serve to illustrate the following
sentiment: what does it matter how long the path to the final unravelling of
the mysteries is, as long as the journey is fun?



Appendix A 

Nonlinear preferential rewiring 
in fixed-size networks as a 
diffusion process 



We present an evolving network model in which the total numbers of nodes
and edges are conserved, but in which edges are continuously rewired according
to nonlinear preferential detachment and reattachment. Assuming power-law
kernels with exponents $\alpha$ and $\beta$, the stationary states the degree
distributions evolve towards exhibit a second order phase transition - from
relatively homogeneous to highly heterogeneous (with the emergence of starlike
structures) - at $\alpha = \beta$. Temporal evolution of the distribution in this
critical regime is shown to follow a nonlinear diffusion equation, arriving at
either pure or mixed power-laws, of exponents $-\alpha$ and $1 - \alpha$.

Complex systems may often be described as a set of nodes with edges connecting
some of them - the neighbours (see, for instance, Boccaletti et al., 2006;
Arenas et al., 2008a; Marro et al., 2008). The number of edges a particular
node has is called its degree, $k$. The study of such large networks is usually
made simpler by considering statistical properties, e.g., the degree
distribution, $p(k)$ (probability of finding a node with a particular degree).
It turns out that a high proportion of real-world networks follow power-law
degree distributions, $p(k) \sim k^{-\gamma}$ - referred to as scale-free due to
their lack of a characteristic size. Also, many of them have their edges placed
among the nodes apparently in a random way - i.e., there is no correlation
between the degree of a node and any other of its properties, such as the
degrees of its neighbours. Barabási and Albert (Barabási and Albert, 1999)
applied the mechanism of preferential attachment to an evolving network model
and showed how this resulted in the degree distributions becoming scale-free
for long enough times. For this to work, attachment had to be linear - i.e.,
the probability a node with degree $k$ has of receiving a new edge is
$\pi(k) \sim k + q$. This results in scale-free stationary degree distributions
with an exponent $\gamma = 3 - q$.

Preferential attachment seems to be behind the emergence of many real-world,
continuously growing networks. However, not all networks in which some nodes at
times gain (or lose) edges have a continuously growing number of nodes. For
example, a given group of people may form an evolving social network (Kossinets
and Watts, 2006) in which the edges represent friendship. Preferential
attachment may be relevant here - the more people you know, the more likely it
is that you will be introduced to someone new - but probabilities are not
expected to depend linearly on degree. For instance, there may be saturations
(highly connected people might become less accessible), threshold effects
(hermits may be prone to antisocial tendencies), and other non-linearities. The
brain may also be a relevant case. Once the brain has formed, its number of
neurons does not seem to keep increasing, and yet its structural topology is
dynamic (Klintsova and Greenough, 1999). Synaptic growth and dendritic
arborization have been shown to increase with electric stimulation (Lee et al.,
1980; Roo et al., 2008) and, in general, the more connected a neuron is, the
more current it receives from the sum of its neighbours.

Barabási and Albert showed that both (linear) preferential attachment and
an ever-growing number of nodes are needed for scaling to emerge in their
model. In a fixed population, their mechanism would result in a fully-connected
network. However, this is not normally observed in real systems. Rather, just
as some new edges sprout, others disappear - less used synapses suffer
atrophy, unstimulating friendships wither. Often, the numbers of both nodes
and edges remain roughly constant. The same authors therefore extended
their model so as to include the effects of preferential rewiring (which could
be applied to fixed-size networks), although again probabilities depended
linearly on node degree (Albert and Barabási, 2000). Another mechanism which
(roughly) keeps the numbers of nodes and edges constant is node fusing
(Thurner et al., 2007), once more according to linear probabilities. As to
nonlinear preferential attachment, the (growing) BA model was extended to
take power-law probabilities into account (Krapivsky et al., 2000), although
the solutions are only scale-free for the linear case.

In this note we present an evolving network model with preferential rewiring
according to nonlinear (power-law) probabilities. The numbers of nodes and
edges are conserved but the topology evolves, arriving eventually at a
macroscopically (nonequilibrium) stationary state - as described by global
properties such as the degree distribution. Depending on the exponents chosen
for the rewiring probabilities, the final state can be either fairly
homogeneous, with a typical size, or highly heterogeneous, with the emergence
of starlike structures. In the critical case marking the transition between
these two regimes, the degree distribution is shown to follow a nonlinear
diffusion equation. This describes a tendency towards stationary states that
are characterized either by scale-free or by mixed scale-free distributions,
depending on parameters.

Our model consists of a random network with $N$ nodes of respective degree
$k_i$, $i = 1, 2, \ldots, N$, and $\frac{1}{2}N\langle k \rangle$ edges.
Initially, the degrees have a given distribution $p(k, t = 0)$. At each time
step, one node is chosen with a probability which is a function of its degree,
$\rho(k_i)$. One of its edges is then chosen randomly and removed from it, to
be reconnected to another node $j$ chosen according to a probability
$\pi(k_j)$. That is, an edge is broken and another one is created, and the
total number of edges, as well as the total number of nodes, is conserved. The
functions $\pi(k)$ and $\rho(k)$ are arbitrary, but we shall explicitly
illustrate here $\pi(k_i) \sim k_i^{\alpha}$ and $\rho(k_i) \sim k_i^{\beta}$,
which capture the essence of a wide class of nonlinear monotonic response
functions and are easy to handle analytically.
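
The rewiring rule just described can be sketched in a few lines. The version below is only a minimal illustration and not the code used for the simulations in this appendix: it tracks the degree sequence alone (so it ignores which node sits at the other end of each edge, as well as self-loops and repeated edges), it assumes non-negative exponents, and the names `rewire_degrees`, `adjusted_variance`, `k_min` and `seed` are hypothetical.

```python
import random


def rewire_degrees(degrees, alpha, beta, steps, k_min=1, seed=0):
    """Apply `steps` rewiring events to a degree sequence and return the result.

    At each event a node i loses one edge end with probability proportional
    to k_i**beta, and another node j gains it with probability proportional
    to k_j**alpha. Assumes alpha >= 0 and beta >= 0.
    """
    rng = random.Random(seed)
    k = list(degrees)
    nodes = range(len(k))
    for _ in range(steps):
        # Detachment kernel; nodes already at k_min are not allowed to lose.
        loss_w = [float(ki) ** beta if ki > k_min else 0.0 for ki in k]
        if sum(loss_w) == 0.0:
            break  # nobody can spare an edge
        i = rng.choices(nodes, weights=loss_w)[0]
        # Attachment kernel, excluding the node that has just been chosen.
        gain_w = [float(kj) ** alpha for kj in k]
        gain_w[i] = 0.0
        if sum(gain_w) == 0.0:
            continue  # no eligible target for this event
        j = rng.choices(nodes, weights=gain_w)[0]
        k[i] -= 1
        k[j] += 1
    return k


def adjusted_variance(degrees):
    """Variance of the degree sequence divided by the squared mean degree."""
    n = len(degrees)
    mean = sum(degrees) / n
    var = sum((ki - mean) ** 2 for ki in degrees) / n
    return var / mean ** 2


if __name__ == "__main__":
    start = [10] * 200
    for alpha in (0.5, 1.0, 1.5):
        final = rewire_degrees(start, alpha=alpha, beta=1.0, steps=20000)
        print(alpha, adjusted_variance(final))
```

Tracking only degrees is enough for this purpose because both kernels depend on the degree alone; a full implementation would also have to store the edge list in order to decide which edge of node $i$ is moved.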

The probabilities $\pi$ and $\rho$ a given node has, at each time step, of
increasing or decreasing its degree can be interpreted as transition
probabilities between states. The expected value of the increment in a given
$p(k, t)$ at each time step, $\Delta p(k, t)$, may then be written as

$$\Delta p(k,t) = (k-1)^{\alpha}\,\bar{k}_{\alpha}^{-1}\,p(k-1,t) + (k+1)^{\beta}\,\bar{k}_{\beta}^{-1}\,p(k+1,t) - \left(k^{\alpha}\,\bar{k}_{\alpha}^{-1} + k^{\beta}\,\bar{k}_{\beta}^{-1}\right)p(k,t), \qquad (A.1)$$

where $\bar{k}_a = \bar{k}_a(t) = \sum_k k^a\,p(k,t)$. If it exists, any
stationary solution must satisfy the condition
$p_{st}(k+1)\,(k+1)^{\beta}\,\bar{k}_{\alpha} = p_{st}(k)\,k^{\alpha}\,\bar{k}_{\beta}$
which, for $k \gg 1$, implies that

$$\frac{\partial p_{st}(k)}{\partial k} = \left(\frac{\bar{k}_{\beta}\,k^{\alpha}}{\bar{k}_{\alpha}\,(k+1)^{\beta}} - 1\right)p_{st}(k). \qquad (A.2)$$

Therefore, the distribution will have an extremum at
$k_e = (\bar{k}_{\alpha}/\bar{k}_{\beta})^{1/(\alpha-\beta)}$ (where we have
approximated $k_e \simeq k_e + 1$). If $\alpha < \beta$, this will be a maximum,
signalling the peak of the distribution. On the other hand, if $\alpha > \beta$,
$k_e$ will correspond to a minimum. Therefore, most of the distribution will be
broken in two parts, one for $k < k_e$ and another for $k > k_e$. The critical
case for $\alpha = \beta$ will correspond to a monotonically decreasing
stationary distribution, but such that
$\lim_{k\to\infty} \partial p_{st}(k)/\partial k = 0$. In fact, Eq. (A.1) is for
this situation ($\alpha = \beta$) the discretized version of a nonlinear
diffusion equation,

$$\frac{\partial p(k,\tau)}{\partial \tau} = \frac{\partial^2}{\partial k^2}\left[k^{\alpha}\,p(k,\tau)\right], \qquad (A.3)$$

after dynamically modifying the time scale according to
$\tau = t/\bar{k}_{\alpha}(t)$. Ignoring, for the moment, border effects, the
solutions of this equation are of the form

$$p_{st}(k) \sim A\,k^{-\alpha} + B\,k^{-\alpha+1}, \qquad (A.4)$$

with $A$ and $B$ constants. If $\alpha > 2$, then given $A$ we can always find a
$B$ which allows $p_{st}(k)$ to be normalized in the thermodynamic limit. For
example, if the lower limit is $k \geq 1$, then
$B = (\alpha - 2)[1 - A/(\alpha - 1)]$. However, if $1 < \alpha \leq 2$, then
only $A$ can remain non-zero, and $p_{st}(k)$ will be a pure power law. For
$\alpha \leq 1$, both constants must tend to zero as $N \to \infty$. In finite
networks, no node can have a degree larger than $N - 1$ or lower than 0. In
fact, one would usually wish to impose a minimum nonzero degree, e.g.
$k \geq 1$. The temporal evolution of the degree distribution is illustrated in
Fig. A.1. This shows the result of integrating Eq. (A.1) for $k \geq 1$,
different times, $\beta = 1$, and three different values of $\alpha$, along with
the respective values obtained from Monte Carlo simulations.
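
One way such a numerical integration of Eq. (A.1) might be carried out is sketched below, using explicit Euler steps on a truncated range of degrees. This is an illustrative sketch under stated assumptions rather than the procedure actually used for Fig. A.1: the step size `dt`, the truncation of the degree range, and the boundary treatment (no loss from the lowest degree, no gain at the highest, with the normalising sums restricted accordingly so that the mean degree is conserved exactly) are choices made here, and the function names are hypothetical. Non-negative exponents are assumed.

```python
def master_equation_step(p, alpha, beta, k_min=1, dt=0.01):
    """One explicit Euler step of Eq. (A.1) on degrees k_min, ..., k_min+len(p)-1."""
    n = len(p)
    ks = [k_min + i for i in range(n)]
    # Probability flux from k to k+1 (gain) and from k to k-1 (loss).
    gain = [float(ks[i]) ** alpha * p[i] if i < n - 1 else 0.0 for i in range(n)]
    loss = [float(ks[i]) ** beta * p[i] if i > 0 else 0.0 for i in range(n)]
    g_norm = sum(gain)  # plays the role of the alpha-moment normalisation
    l_norm = sum(loss)  # plays the role of the beta-moment normalisation
    if g_norm == 0.0 or l_norm == 0.0:
        return list(p)  # nothing can move
    gain = [g / g_norm for g in gain]
    loss = [l / l_norm for l in loss]
    new_p = []
    for i in range(n):
        inflow = (gain[i - 1] if i > 0 else 0.0) + (loss[i + 1] if i < n - 1 else 0.0)
        outflow = gain[i] + loss[i]
        new_p.append(p[i] + dt * (inflow - outflow))
    return new_p


def integrate(p0, alpha, beta, steps, k_min=1, dt=0.01):
    """Iterate master_equation_step `steps` times starting from p0."""
    p = list(p0)
    for _ in range(steps):
        p = master_equation_step(p, alpha, beta, k_min, dt)
    return p


if __name__ == "__main__":
    p0 = [0.0] * 200
    p0[9] = 1.0  # every node starts with degree 10
    p = integrate(p0, alpha=1.0, beta=1.0, steps=5000)
    print(sum(p), sum((1 + i) * pi for i, pi in enumerate(p)))
```

Because each term is written as a flux between neighbouring degrees, total probability is conserved by construction; the step `dt` has to be small enough that no degree class loses more probability than it holds in a single step.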

The main result may be summarized as follows. For $\alpha < \beta$, the network
will evolve to have a characteristic size, centred around $\langle k \rangle$.
At the critical case $\alpha = \beta$, all sizes appear, according either to a
pure or a composite power law, as detailed above. For $\alpha > \beta$, if we
impose, say, $k \geq 1$, then starlike structures will emerge, with a great
many nodes connected to just a few hubs.

Figure A.2 illustrates the second order phase transition undergone by the
variance of the final (stationary) degree distribution, depending on the
exponent $\alpha$, where $\beta$ is set to unity. It should be mentioned that this particular


^Although all moments of k will diverge unless B = 0. 

^There is a finite-size effect not taken into account by the theory - but relevant when
$\alpha > \beta$ - which provides a natural lower cutoff for $p_{st}(k)$: if there are, say,
$m$ nodes which are connected to the whole network, then the minimum degree a node can have is $m$.
Figure A.1: Degree distribution $p(k,t)$ at four different stages of evolution:
t = 10^ [(yellow) squares], 10^ [(blue) circles], 10^ [(red) triangles] and 10^
MCS [(black) diamonds]. From top to bottom panels, subcritical ($\alpha = 0.5$),
critical ($\alpha = 1$) and supercritical ($\alpha = 1.5$) rewiring exponents.
Symbols from MC simulations and corresponding solid lines from numerical
integration of Eq. (A.1). $\beta = 1$, $\langle k \rangle = 10$ and $N = 1000$
in all cases.



case, $\beta = 1$, corresponds to edges being chosen at random for disconnection,
since the probability of a random edge belonging to node $i$ is proportional to
$k_i$.

This topological phase transition is similar to the ones that have been
Figure A.2: Adjusted variance $\sigma^2/\langle k \rangle^2$ of the degree
distribution after 2 x 10^ MCS against $\alpha$, as obtained from MC
simulations, for system sizes $N = 800$ [(yellow) squares], 1200 [(blue)
circles], 1600 [(red) triangles] and 2000 [(black) diamonds]. Top left inset
shows final degree distributions for $\alpha = 0.5$ [light gray (blue)], 1 [dark
gray (red)] and 1.5 (black), with $N = 1000$. Bottom right inset shows typical
time series of $\sigma^2/\langle k \rangle^2$ for the same three values of
$\alpha$ and $N = 1200$. In all cases, $\beta = 1$ and $\langle k \rangle = 10$.



described in equilibrium network ensembles defined via an energy function
in the so-called synchronic approach to network analysis (Farkas et al., 2004;
Park and Newman, 2004; Burda et al., 2004; Derényi et al., 2004). However,
our (nonequilibrium) model does not come within the scope of this body of
work, since the rewiring rates cannot, in general, be derived from a potential.
Furthermore, we are here concerned with the time evolution rather than the
stationary states, making our approach diachronic.

Summing up, in spite of its simplicity, our model captures the essence of
many real-world networks which evolve while leaving the total numbers of
nodes and edges roughly constant. The degree of heterogeneity of the stationary
distribution obtained is seen to depend crucially on the relation between
the exponents modelling the probabilities a node has of gaining or losing
an edge. It is worth mentioning that the heterogeneity of the degree
distribution of a random network has been found to determine many relevant
behaviours and magnitudes such as its clustering coefficient and mean
minimum path (Newman, 2003c), critical values related to the dynamics of
excitable networks (Johnson et al., 2008), or the synchronizability for systems
of coupled oscillators (since this depends on the spectral gap of the Laplacian
matrix) (Barahona and Pecora, 2002).

The above shows how scale-free distributions, with a range of exponents,
may emerge for nonlinear rewiring, although only in the critical situation in
which the probabilities of gaining or losing edges are the same. We believe
that this non-trivial relation between the microscopic rewiring actions
(governed in our case by parameters $\alpha$ and $\beta$) and the emergent
macroscopic degree distributions could shed light on a class of biological,
social and communications networks.



Appendix B 

Effective modularity of highly 
clustered networks 



The number of nodes within a radius $r$ is $n(r) = A_d r^d$, with $A_d$ a
constant. We shall therefore assume a node with degree $k$ to have edges to all
nodes up to a distance $r(k) = (k/A_d)^{1/d}$, and none beyond (note that this
is not necessarily always feasible in practice). To estimate $\lambda$, we shall
first calculate the probability that a randomly chosen edge have length $x$.
The chance that the edge belong to a node with degree $k$ is
$\pi(k) \sim k\,p(k)$ (where $p(k)$ is the degree distribution). The proportion
of edges that have length $x$ among those belonging to a node with degree $k$
is $\nu(x|k) = d A_d x^{d-1}/k$ if $A_d x^d < k$, and $0$ otherwise.
Considering, for example, scale-free networks (as in Rozenfeld et al., 2002),
so that the degree distribution is $p(k) \sim k^{-\gamma}$ in some interval
$k \in [k_0, k_{max}]$ (Barabási and Albert, 1999), and integrating over
$p(k)$, we have the distribution of lengths,

$$P(x) = (\mathrm{Const.}) \int_{\max(k_0,\,A_d x^d)}^{k_{max}} \pi(k)\,\nu(x|k)\,dk = d(\gamma-2)\,x^{-[d(\gamma-2)+1]},$$

where we have assumed, for simplicity, that the network is sufficiently sparse
that $\max(k_0, A_d x^d) = A_d x^d$, $\forall x \geq 1$, and where we have
normalised for the interval $1 \leq x < \infty$; strictly,
$x \leq (k_{max}/A_d)^{1/d}$, but we shall also ignore this effect. Next we
need the probability that an edge of length $x$ fall between two compartments
of linear size $l$. This depends on the geometry of the situation as well as
dimensionality; however, a first approximation which is independent of such
considerations is

$$P_{out}(x) = \min\left(1, \frac{x}{l}\right).$$

We can now estimate the modularity $\lambda$ as

$$\lambda = \int_1^{\infty} P_{out}(x)\,P(x)\,dx = \frac{1}{d(\gamma-2)-1}\left[d(\gamma-2)\,l^{-1} - l^{-d(\gamma-2)}\right].$$

Fig. 7.5 shows how $\lambda$ depends on $\gamma$ for $d = 2$ and various box sizes.
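
The final integral can be worked through explicitly. The steps below are a sketch assuming the approximation $P_{out}(x) = \min(1, x/l)$ and the normalised length distribution $P(x) = d(\gamma-2)\,x^{-[d(\gamma-2)+1]}$; the shorthand $a \equiv d(\gamma-2)$ is introduced here only to keep the expressions short, and $a \neq 1$, $l \geq 1$ are assumed.

```latex
\begin{align}
\lambda &= \int_1^{\infty} P_{out}(x)\,P(x)\,dx
         = \frac{1}{l}\int_1^{l} x \cdot a\,x^{-(a+1)}\,dx
           + \int_l^{\infty} a\,x^{-(a+1)}\,dx \\
        &= \frac{a}{l}\,\frac{l^{1-a} - 1}{1-a} + l^{-a}
         = \frac{a\,l^{-a} - a\,l^{-1} + (1-a)\,l^{-a}}{1-a} \\
        &= \frac{1}{a-1}\left[a\,l^{-1} - l^{-a}\right]
         = \frac{1}{d(\gamma-2)-1}\left[d(\gamma-2)\,l^{-1} - l^{-d(\gamma-2)}\right].
\end{align}
```

The first integral covers edges shorter than the box, which leave it with probability $x/l$; the second covers edges longer than the box, which always do.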



Appendix C 
Nestedness of networks 



The property of nestedness has for some time aroused a fair amount of interest 
as regards ecological networks - especially since a high nestedness in mutu- 
alistic systems has been shown to enhance biodiversity. However, because it 
is usually estimated with software, no analytical work has been done relating 
nestedness with other network characteristics, and consequently comparisons 
of experimental data with null-models can only be done computationally. We 
suggest a slightly refined version of the measure recently defined by Bastolla et
al. and go on to study the effect of the degree distribution and degree
correlations (assortativity). Our work provides a benchmark against which empirical
networks can be contrasted.



C.1 Introduction



The intense study that complex networks have undergone over the past decade 
or so has shown how important topological features can be for properties of 
complex systems, such as dynamical behaviour, spreading of information,
resilience to attacks, etc. (Newman, 2003c; Boccaletti et al., 2006). A
paradigmatic case is that of ecosystems. The solution to May's paradox (May, 1973)
- the fact that large ecosystems seem to be especially stable, when theory 
predicts the contrary - is still not clear, but it is widely suspected that there 
is some structural feature of ecological networks which as yet eludes us. One 
aspect of such networks, which has been studied for some time by ecologists 
and may be related to this problem, is called nestedness. Loosely speaking, a
network - say of species and islands, linked whenever the former inhabit the
latter - is said to be highly nested if the species which exist on scarcely
populated islands tend always to be found also on those islands inhabited by many
different species. This can be most easily seen by graphically representing a
matrix such that animals are columns and islands are rows, with elements equal
to one whenever two nodes are linked and zero if not. If, after ordering each
kind of node by degree (number of neighbours), all the ones can be quite neatly
packed into one corner, the network is considered highly nested. This is done
in Fig. C.1 for a network of plants inhabiting islands off Perth. This rather
vague concept is usually measured with software for the purpose. For Fig.
C.1 we have used NESTEDNESS CALCULATOR, which estimates a curve
of equal density of ones and zeros, calculates how many ones and zeros are on
the "wrong" side and by how much, and returns a number between 0 and 100
called "temperature" by analogy with some system such as a subliming solid.
A low temperature indicates high nestedness. To determine how significantly
nested a given network is, the usual procedure is to generate equivalent
random networks computationally (with some constraint such as the number of
edges or the degree of each node being conserved) and estimate how likely it
is that such a network be "colder" than that of the data.

Figure C.1: Maximally packed matrix representing a network of plants and
islands off Perth (Abbott and Black, 1980) (because the network is bipartite,
the adjacency matrix is composed of four blocks: two identical to this
matrix, the other two composed of zeros). Data, image and line obtained from
NESTEDNESS CALCULATOR, which returns a "temperature" of T = 0.69°
for this particular network.
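
The degree ordering used to "pack" such a matrix can be sketched in a few lines. This is only an illustration of the reordering step, not of the temperature calculation performed by the nestedness software; the function name and the toy data are hypothetical.

```python
def pack_matrix(matrix):
    """Reorder rows and columns of a 0/1 matrix by decreasing degree."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_degree = [sum(row) for row in matrix]
    col_degree = [sum(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    # sorted() is stable, so ties keep their original relative order.
    row_order = sorted(range(n_rows), key=lambda r: -row_degree[r])
    col_order = sorted(range(n_cols), key=lambda c: -col_degree[c])
    return [[matrix[r][c] for c in col_order] for r in row_order]


# Hypothetical toy data: rows are islands, columns are species.
toy = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
packed = pack_matrix(toy)
for row in packed:
    print(row)
```

In this toy case all the ones end up in the upper-left corner, which is what a perfectly nested matrix looks like after ordering.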



Bastolla et al. (Bastolla et al., 2009) have recently shown how symbiotic
interactions can reduce the effective competition between two species, say of 
insect, via common symbiotic hosts - such as plants they pollinate. These 
authors define a measure to take into account the average number of shared 
partners in these mutualistic networks, and call it "nestedness" because it 
would seem to be related to the concept referred to above. They go on to show 
evidence of how the nestedness of empirical mutualistic networks is correlated
with the biodiversity of the corresponding ecosystems. This beneficial effect
that "enemy" nodes can gain from sharing "friendly" partners is not confined to
ecosystems. It is expected also to play a role, for instance, in financial networks
or other economic systems (Sugihara and Ye, 2009). The principle is simple.
Say nodes A and B are in competition with each other. An increase in A will
be to B's detriment, and vice versa; but if both A and B engage in a symbiotic
relationship with node C, then A's thriving will stimulate C, which in turn will
be helpful to B. Thus, the effective competition between A and B is reduced,
and the whole system becomes more stable and capable of sustaining more
nodes (jPomfnguez-Chibetm et al.l , 120111 ). 

In Ref. (Johnson and Muñoz) we take up this idea of shared neighbours
(though characterised with a slightly different measure, for reasons we shall
explain in Section C.2) and study analytically the effect of other topological
properties, such as the degree distribution and degree-degree correlations. This
allows us to contrast empirical data with null-models and thus test for
statistical significance with no need of computer randomisations. We also comment
on how mutual-neighbour structure could develop in systems of interdependent
networks (such as competition and symbiosis) so as to minimise the risk of a
"cascade of failures" (Buldyrev et al., 2010). Although we are not here
concerned specifically with neural systems, a description of this work is included
as an appendix since it serves as an example application of the method put
forward in Ref. (Johnson et al., 2010b) and presented in Chapter 5.



C.2 Definition 

Consider a network with $N$ nodes defined by the adjacency matrix $\hat{a}$: the
element $\hat{a}_{ij}$ is equal to the number of links, or edges, from node $j$ to node
$i$ (typically considered to be either one or zero). If $\hat{a}$ is symmetric, then
the network is undirected and each node $i$ can be characterised by a degree
$k_i = \sum_j \hat{a}_{ij}$. (If it is directed, $i$ has both an in degree, $k_i^{in} = \sum_j \hat{a}_{ij}$, and an out
degree, $k_i^{out} = \sum_j \hat{a}_{ji}$; we shall focus here on undirected networks, although
most of the results could be easily extended to directed ones.)



Bastolla et al. (Bastolla et al., 2009) have shown that the effective
competition between two species (say two species of insect) can be reduced if they
have common neighbours with which they are in symbiosis (for instance, if
they both pollinate the same plant). Therefore, in mutualistic networks
(networks of symbiotic interactions) it is beneficial to the species at two nodes $i$
and $j$ for the number of shared symbiotic partners, $n_{ij} = \sum_l \hat{a}_{il}\hat{a}_{lj} = (\hat{a}^2)_{ij}$, to
be high. Going on this, and assuming the network is undirected, the authors
suggest taking into account the following measure:

$$\eta_B = \frac{\sum_{ij} n_{ij}}{\sum_{ij} \min(k_i, k_j)}, \tag{C.1}$$

which they call nestedness because it would seem to be highly correlated with
the measures returned by nestedness software. Note that, although the authors
were considering only bipartite graphs, this characteristic is not imposed in the
above definition. In this work, we shall take up the idea of the importance of
$n_{ij}$, but use a slightly different measure of nestedness, for several reasons. One
is that $\eta_B$ has a serious shortcoming. If we commute the sums in the
numerator of Eq. (C.1), we find that the result only depends on the heterogeneity of
the degree distribution: $\sum_{ij} n_{ij} = \sum_l \sum_i \hat{a}_{il} \sum_j \hat{a}_{lj} = N\langle k^2\rangle$. Also, although
the maximum value $n_{ij}$ can take is $\min(k_i, k_j)$, this is not necessarily the best
normalisation factor, since the expected number of paths of length 2
connecting nodes $i$ and $j$ depends on both $k_i$ and $k_j$ (as we show explicitly in Section
C.3). Furthermore, it can sometimes be convenient to have a local measure of
nestedness. For these reasons, we shall use

$$\eta_{ij} \equiv \frac{n_{ij}}{k_i k_j} = \frac{(\hat{a}^2)_{ij}}{k_i k_j}, \tag{C.2}$$

which is defined for every pair of nodes $(i, j)$. This allows for the consideration
of a nestedness per node, $\eta_i = N^{-1}\sum_j \eta_{ij}$, or of the global measure

$$\eta = \frac{1}{N^2}\sum_{ij} \eta_{ij}. \tag{C.3}$$
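
The pairwise, per-node and global measures can be computed directly from an adjacency matrix. The sketch below assumes the definitions $\eta_{ij} = (\hat{a}^2)_{ij}/(k_i k_j)$, $\eta_i = N^{-1}\sum_j \eta_{ij}$ and $\eta = N^{-2}\sum_{ij}\eta_{ij}$, with sums running over all $i$ and $j$ (diagonal included) and every node having at least one neighbour. Function names and the toy network are hypothetical.

```python
def shared_partners(a):
    """n_ij = (a^2)_ij: number of common neighbours of nodes i and j."""
    n = len(a)
    return [[sum(a[i][l] * a[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def nestedness(a):
    """Return (eta_ij, eta_i, eta) for a symmetric 0/1 adjacency matrix.

    Assumes every node has degree >= 1, otherwise eta_ij is undefined.
    """
    n = len(a)
    k = [sum(row) for row in a]
    n_ij = shared_partners(a)
    eta_pair = [[n_ij[i][j] / (k[i] * k[j]) for j in range(n)] for i in range(n)]
    eta_node = [sum(eta_pair[i]) / n for i in range(n)]
    eta = sum(eta_node) / n
    return eta_pair, eta_node, eta


# Hypothetical toy network: a star with one hub (node 0) and three leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
eta_pair, eta_node, eta = nestedness(star)
print(eta_node, eta)
```

In the star, every pair of leaves shares its only neighbour (the hub), so the leaves carry most of the nestedness, while the hub shares partners only with itself.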

C.3 The effect of the degree distribution 

Most networks have quite broad degree distributions $p(k)$, most notably the
fairly ubiquitous scale-free networks, for which they follow power-laws, $p(k) \sim
k^{-\gamma}$. Since this heterogeneity tends to have an important influence on any
network measure, it will be useful to take this effect into account analytically.
As is standard, the null-model we shall use to do this is the configuration
model (Newman, 2003c): the set of random networks wired according to the
constraints that a given degree sequence $(k_1, ..., k_N)$ is respected, and also that
there be no degree-degree correlations. The expected value of an element of
the adjacency matrix for networks belonging to this ensemble is

$$\overline{\hat{a}_{ij}} = \frac{k_i k_j}{\langle k\rangle N}. \tag{C.4}$$

We shall use a line, $\overline{(\cdot)}$, to represent expected values given certain constraints,
and angles, $\langle\cdot\rangle$, for averages over nodes of a given network². For the case of
the adjacency matrix, we use the notation $\hat{\epsilon}_{ij} \equiv \overline{\hat{a}_{ij}}$ for clarity and coherence
with previous work. Plugging Eq. (C.4) into Eq. (C.2), we have the expected
value in the configuration ensemble,

$$\overline{\eta_{ij}} = \frac{\langle k^2\rangle}{\langle k\rangle^2 N} \equiv \eta_{conf}. \tag{C.5}$$

Since $\eta_{conf}$ is independent of $i$ and $j$, it coincides with the expected value for the
global measure, $\overline{\eta} = \eta_{conf}$, a fact that justifies the normalisation chosen in
Eq. (C.2). It is obvious from Eq. (C.5) that degree heterogeneity will have
an important effect on $\eta$. Therefore, if we are to capture aspects of network
structure other than those directly induced by the degree distribution, it will
in general be useful to consider the nestedness normalised to this expected
value,

$$\tilde{\eta} \equiv \frac{\eta}{\eta_{conf}} = \frac{\langle k\rangle^2}{\langle k^2\rangle N} \sum_{ij} \frac{(\hat{a}^2)_{ij}}{k_i k_j}. \tag{C.6}$$

Although $\tilde{\eta}$ is unbounded, it has the advantage that it is equal to unity for
any uncorrelated random network, independently of its degree heterogeneity,
thereby making it possible to detect non-trivial structure in a given empirical
network without the need for computational randomisations.

¹ In an undirected network, $\sum_{i<j} = \frac{1}{2}\sum_{ij}$; we shall always sum over all $i$ and $j$, since it
is easier to generalise to directed networks and often avoids writing factors 2.
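
The statement that the normalised nestedness equals unity in the configuration ensemble can be checked numerically. The sketch below assumes the expected adjacency matrix $\hat{\epsilon}_{ij} = k_i k_j/(\langle k\rangle N)$ and $\eta_{conf} = \langle k^2\rangle/(\langle k\rangle^2 N)$, and evaluates the pairwise measure with $\hat{\epsilon}$ in place of $\hat{a}$, as in the derivation above. The degree sequence and function names are hypothetical.

```python
def eta_conf(k):
    """Expected nestedness in the configuration ensemble: <k^2> / (<k>^2 N)."""
    n = len(k)
    mean_k = sum(k) / n
    mean_k2 = sum(x * x for x in k) / n
    return mean_k2 / (mean_k ** 2 * n)


def expected_nestedness(eps, k):
    """Global nestedness with the mean adjacency matrix eps in place of a."""
    n = len(k)
    total = 0.0
    for i in range(n):
        for j in range(n):
            paths = sum(eps[i][l] * eps[l][j] for l in range(n))  # (eps^2)_ij
            total += paths / (k[i] * k[j])
    return total / n ** 2


k = [1, 1, 2, 2, 3, 5]  # hypothetical degree sequence
n = len(k)
mean_k = sum(k) / n
eps = [[k[i] * k[j] / (mean_k * n) for j in range(n)] for i in range(n)]
ratio = expected_nestedness(eps, k) / eta_conf(k)
print(ratio)
```

The ratio is one whatever degree sequence is used, which is the sense in which the normalisation removes the effect of degree heterogeneity.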



C.4 Nestedness and assortativity 

In the configuration ensemble, the expected value of the mean degree of the
neighbours of a given node is $\overline{k_{nn,i}} = k_i^{-1}\sum_j \hat{\epsilon}_{ij}k_j = \langle k^2\rangle/\langle k\rangle$, which is
independent of $k_i$. However, real networks usually display degree-degree correlations,
with the result that $\overline{k_{nn,i}} = \overline{k_{nn}}(k_i)$. If $\overline{k_{nn}}(k)$ increases (decreases) with $k$,
the network is assortative (disassortative). A measure of this phenomenon
is Pearson's coefficient applied to the edges (Newman, 2002, 2003a, 2003c;
Boccaletti et al., 2006): $r = ([k_l k'_l] - [k_l]^2)/([k_l^2] - [k_l]^2)$, where $k_l$ and $k'_l$ are the
degrees of each of the two nodes belonging to edge $l$, and $[\cdot] \equiv (\langle k\rangle N)^{-1}\sum_l(\cdot)$
is an average over edges. Writing $\sum_l(\cdot) = \sum_{ij}\hat{a}_{ij}(\cdot)$, $r$ can be expressed as

$$r = \frac{\langle k\rangle\langle k^2\,\overline{k_{nn}}(k)\rangle - \langle k^2\rangle^2}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}. \tag{C.7}$$

² In this case, for instance, the network considered for $\langle k\rangle$ is any of the members of the
ensemble, since they all have the same mean degree by definition.
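
For a concrete network, Pearson's coefficient over edges can be evaluated straight from the adjacency matrix. The sketch below assumes an undirected 0/1 matrix and sums over both directions of each edge, so that the edge average is $(\langle k\rangle N)^{-1}\sum_{ij}\hat{a}_{ij}(\cdot)$. The function name and toy networks are hypothetical, and the coefficient is undefined for regular networks, where the denominator vanishes.

```python
def assortativity(a):
    """Pearson degree-degree correlation over the edges of an undirected network."""
    n = len(a)
    k = [sum(row) for row in a]
    m = sum(k)  # <k> N: number of (i, j) entries with a_ij = 1
    pairs = [(i, j) for i in range(n) for j in range(n) if a[i][j]]
    kk = sum(k[i] * k[j] for i, j in pairs) / m  # [k k']
    k1 = sum(k[i] for i, j in pairs) / m         # [k]
    k2 = sum(k[i] ** 2 for i, j in pairs) / m    # [k^2]
    return (kk - k1 ** 2) / (k2 - k1 ** 2)       # fails if all degrees are equal


# Hypothetical toy networks: a star with three leaves and a path of four nodes.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
r_star = assortativity(star)
r_path = assortativity(path)
print(r_star, r_path)
```

The star is the extreme disassortative case: every edge joins the highest-degree node to a lowest-degree one.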



The ensemble of all networks with a given degree sequence $(k_1, ..., k_N)$
contains a subset for all members of which $\overline{k_{nn}}(k)$ is constant (the configuration
ensemble), but also subsets displaying other functions $\overline{k_{nn}}(k)$.

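In Chapter 5 (Johnson et al., 2010b) we showed that there is a one-to-one
mapping between any mean-nearest-neighbour function $\overline{k_{nn}}(k)$ and its
corresponding mean-adjacency-matrix $\hat{\epsilon}$, which is as follows: writing $\overline{k_{nn}}(k)$ as

$$\overline{k_{nn}}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + \int d\nu\, f(\nu)\,\sigma_{\nu+1}\left[\frac{k^{\nu-1}}{\langle k^{\nu}\rangle} - \frac{1}{k}\right] \tag{C.8}$$

with $\sigma_{\nu+1} \equiv \langle k^{\nu+1}\rangle - \langle k\rangle\langle k^{\nu}\rangle$ (which can always be done), the corresponding
matrix $\hat{\epsilon}$ takes the form

$$\hat{\epsilon}_{ij} = \frac{k_i k_j}{\langle k\rangle N} + \int d\nu\, \frac{f(\nu)}{N}\left[\frac{(k_i k_j)^{\nu}}{\langle k^{\nu}\rangle} - k_i^{\nu} - k_j^{\nu} + \langle k^{\nu}\rangle\right]. \tag{C.9}$$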



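In many empirical networks, $\overline{k_{nn}}(k)$ has the form $\overline{k_{nn}}(k) = A + Bk^{\beta}$, with
$A, B > 0$ (Boccaletti et al., 2006; Pastor-Satorras et al., 2001) - the mixing
being assortative (disassortative) if $\beta$ is positive (negative). Such a case is
fitted by Eq. (C.8) if $f(\nu) = C[\delta(\nu - \beta - 1)\sigma_2/\sigma_{\beta+2} - \delta(\nu - 1)]$, with $C$ a
positive constant, since this choice yields

$$\overline{k_{nn}}(k) = \frac{\langle k^2\rangle}{\langle k\rangle} + C\sigma_2\left[\frac{k^{\beta}}{\langle k^{\beta+1}\rangle} - \frac{1}{\langle k\rangle}\right]. \tag{C.10}$$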



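After plugging Eq. (C.10) into Eq. (C.7), one obtains:

$$r = \frac{C\sigma_2}{\langle k^{\beta+1}\rangle}\left(\frac{\langle k\rangle\langle k^{\beta+2}\rangle - \langle k^2\rangle\langle k^{\beta+1}\rangle}{\langle k\rangle\langle k^3\rangle - \langle k^2\rangle^2}\right). \tag{C.11}$$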



It turns out that the configurations most likely to arise naturally (those with
maximum entropy) usually have $C \simeq 1$ (Johnson et al., 2010b) (c.f. Chapter
5). Therefore, and for the sake of analytical tractability, we shall do as in
Chapter 6 and consider this particular case³ - that is, we shall use

$$\hat{\epsilon}_{ij} = \frac{1}{N}\left\{\frac{\sigma_2}{\sigma_{\beta+2}}\left[\frac{(k_i k_j)^{\beta+1}}{\langle k^{\beta+1}\rangle} - k_i^{\beta+1} - k_j^{\beta+1} + \langle k^{\beta+1}\rangle\right] + k_i + k_j - \langle k\rangle\right\}. \tag{C.12}$$

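Substituting the adjacency matrix for this expression in the definition of $\tilde{\eta}$ (Eq.
(C.6)), we obtain its expected value as a function of the remaining parameter
$\beta$:

$$\tilde{\eta}_{\beta} = \frac{\langle k\rangle^2}{\langle k^2\rangle}\left\{1 + \left(\sigma_2 - \alpha_{\beta}^2\rho_{\beta}\right)\left[2\,\frac{\langle k^{\beta}\rangle\langle k^{-1}\rangle}{\langle k^{\beta+1}\rangle} - \langle k^{-1}\rangle^2\right] + \alpha_{\beta}^2\rho_{\beta}\,\frac{\langle k^{\beta}\rangle^2}{\langle k^{\beta+1}\rangle^2}\right\}, \tag{C.13}$$

where $\alpha_{\beta} \equiv \sigma_2/\sigma_{\beta+2}$ and $\rho_{\beta} \equiv \langle k^{2(\beta+1)}\rangle - \langle k^{\beta+1}\rangle^2$. Note that $\tilde{\eta}_0 = 1$.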
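
A closed form of this kind is easy to cross-check against a brute-force evaluation. The sketch below assumes the mean adjacency matrix $\hat{\epsilon}_{ij} = N^{-1}\{\alpha_\beta[(k_ik_j)^{\beta+1}/\langle k^{\beta+1}\rangle - k_i^{\beta+1} - k_j^{\beta+1} + \langle k^{\beta+1}\rangle] + k_i + k_j - \langle k\rangle\}$ with $\alpha_\beta = \sigma_2/\sigma_{\beta+2}$, and compares the normalised nestedness computed from it with the expression $\tilde{\eta}_\beta = (\langle k\rangle^2/\langle k^2\rangle)\{1 + (\sigma_2 - \alpha_\beta^2\rho_\beta)[2\langle k^\beta\rangle\langle k^{-1}\rangle/\langle k^{\beta+1}\rangle - \langle k^{-1}\rangle^2] + \alpha_\beta^2\rho_\beta\langle k^\beta\rangle^2/\langle k^{\beta+1}\rangle^2\}$. The degree sequence and function names are hypothetical, and $\beta = -1$ must be avoided since $\sigma_1 = 0$.

```python
def moment(k, p):
    """<k^p>: average of the p-th power of the degrees."""
    return sum(x ** p for x in k) / len(k)


def sigma(k, b):
    """sigma_b = <k^b> - <k><k^(b-1)>."""
    return moment(k, b) - moment(k, 1) * moment(k, b - 1)


def eta_closed(k, beta):
    """Normalised nestedness from the assumed closed-form expression."""
    s2 = sigma(k, 2)
    alpha = s2 / sigma(k, beta + 2)
    rho = moment(k, 2 * (beta + 1)) - moment(k, beta + 1) ** 2
    mb, mb1, minv = moment(k, beta), moment(k, beta + 1), moment(k, -1)
    bracket = (1.0
               + (s2 - alpha ** 2 * rho) * (2 * mb * minv / mb1 - minv ** 2)
               + alpha ** 2 * rho * mb ** 2 / mb1 ** 2)
    return moment(k, 1) ** 2 / moment(k, 2) * bracket


def eta_direct(k, beta):
    """Same quantity by explicit matrix multiplication of the mean adjacency matrix."""
    n, b = len(k), beta + 1
    alpha = sigma(k, 2) / sigma(k, beta + 2)
    mb1, mean_k = moment(k, b), moment(k, 1)
    eps = [[(alpha * ((ki * kj) ** b / mb1 - ki ** b - kj ** b + mb1)
             + ki + kj - mean_k) / n for kj in k] for ki in k]
    total = sum(sum(eps[i][l] * eps[l][j] for l in range(n)) / (k[i] * k[j])
                for i in range(n) for j in range(n))
    eta = total / n ** 2
    eta_conf = moment(k, 2) / (mean_k ** 2 * n)
    return eta / eta_conf


k = [1, 2, 2, 3, 4, 6]  # hypothetical degree sequence
for beta in (-0.5, 0.0, 0.5):
    print(beta, eta_closed(k, beta), eta_direct(k, beta))
```

The two evaluations agree to rounding error, and the neutral case $\beta = 0$ returns one, as expected for an uncorrelated ensemble.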

Fig. C.2 shows the value of $\tilde{\eta}_{\beta}$ given by Eq. (C.13) against the
assortativity $r$ for various scale-free networks. Nestedness is seen to grow very
fast with increasing disassortativity (decreasing negative $r$), while in general
slightly assortative networks are less nested than neutral ones. However, highly
heterogeneous networks ($\gamma \to 2$) show an increase in $\tilde{\eta}_{\beta}$ for large positive $r$.
Fig. C.3 shows a plot of nestedness against assortativity for the selection of
empirical networks listed in Table C.1. Although these networks are highly
disparate as regards size, density, degree distribution, etc., it is apparent from
the similarity to Fig. C.2 that the main contribution to $\tilde{\eta}$ comes indeed from
the assortativity.



C.5 Bipartite networks 

³ Note that $C = 1$ corresponds to removing the linear term, proportional to $k_i k_j$, in Eq.
(C.9), and leaving the leading non-linearity, $(k_i k_j)^{\beta+1}$, as the dominant one.

Figure C.3: Nestedness against assortativity (as measured by Pearson's
correlation coefficient) for data on a variety of networks. Blue squares are food
webs (Table C.1) and red circles are networks of all other types (Table C.2).

Mutualistic networks are usually bipartite: two sets of nodes exist such that
all edges are between nodes in one set and those of another. The ones
considered in Ref. (Bastolla et al., 2009), for instance, are composed of animals
and plants which interact in symbiotic relations of feeding-pollination; these
interactions only take place between animals and plants. Let us therefore
consider a bipartite network and call the sets $\Gamma_1$ and $\Gamma_2$, with $n_1$ and $n_2$ nodes,
respectively ($n_1 + n_2 = N$). Using the notation $\langle\cdot\rangle_i$ for averages over set $\Gamma_i$,
the total number of edges is $\langle k\rangle_1 n_1 = \langle k\rangle_2 n_2 = \frac{1}{2}\langle k\rangle N$. Assuming that the
network is defined by the configuration ensemble, though with the additional
constraint of being bipartite, the probability of node $l$ being connected to node
$i$ is

$$\hat{\epsilon}_{il} = 2\,\frac{k_i k_l}{\langle k\rangle N}$$
Food web                  $r$      $\tilde{\eta}$   $\langle k\rangle$   $N$    $\sigma/\langle k\rangle$
Little Rock lake         -0.343   1.219   20.4    92    0.73
Ythan Estuary (w/p)      -0.249   1.323    8.9    82    0.93
Stony Stream             -0.201   1.163   14.7   109    0.75
Canton Creek             -0.196   1.171   13.5   102    0.69
Skipwith Pond            -0.194   0.891   14.2    25    0.37
El Verde                 -0.183   1.088   18.4   155    0.88
Caribbean Reef (small)   -0.172   1.000   19.7    50    0.49
St. Martin Island        -0.165   1.071    9.3    42    0.56
UK Grassland             -0.125   0.907    2.8    61    0.82
Chesapeake Bay           -0.123   0.801    4.1    31    0.60
NE US Shelf              -0.088   0.971   34.3    79    0.45
Coachella Valley          0.043   0.857   14.6    29    0.41
St. Mark's Estuary        0.118   0.816    8.5    48    0.55

Table C.1: Food webs appearing in Fig. C.3 (listed from least to most
assortative): $r$ is the assortativity and $\tilde{\eta}$ the nestedness. The origins of all
data are cited in Ref. (Dunne et al., 2004); the data were kindly provided to us by
Jennifer Dunne.



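if they belong to different sets, and zero if they are in the same one. Proceeding
as before, we find that the expected value of the nestedness for a bipartite
network is

$$\overline{\eta}_{bip} = \frac{1}{N^2}\left[\sum_{i,j\in\Gamma_1}\frac{1}{k_i k_j}\sum_{l\in\Gamma_2}\frac{k_i k_l\, k_l k_j}{\langle k\rangle_1 n_1\,\langle k\rangle_2 n_2} + \sum_{i,j\in\Gamma_2}\frac{1}{k_i k_j}\sum_{l\in\Gamma_1}\frac{k_i k_l\, k_l k_j}{\langle k\rangle_1 n_1\,\langle k\rangle_2 n_2}\right] = \frac{n_1\langle k^2\rangle_2 + n_2\langle k^2\rangle_1}{\langle k\rangle_1\langle k\rangle_2\,(n_1 + n_2)^2}. \tag{C.14}$$

Interestingly, if $n_1 = n_2$, the fact that the network is bipartite has no effect on
the nestedness: $\overline{\eta}_{bip} = \eta_{conf}$.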
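
The remark about equal-sized sets can be verified with a small numerical example. The sketch below assumes the bipartite expectation $[n_1\langle k^2\rangle_2 + n_2\langle k^2\rangle_1]/[\langle k\rangle_1\langle k\rangle_2 (n_1+n_2)^2]$ and the configuration value $\langle k^2\rangle/(\langle k\rangle^2 N)$; the two degree sequences are hypothetical and chosen so that both sets carry the same total number of edge ends.

```python
def eta_bipartite(k1, k2):
    """Expected nestedness of a bipartite configuration ensemble (assumed form)."""
    n1, n2 = len(k1), len(k2)
    mean1, mean2 = sum(k1) / n1, sum(k2) / n2
    sq1 = sum(x * x for x in k1) / n1
    sq2 = sum(x * x for x in k2) / n2
    return (n1 * sq2 + n2 * sq1) / (mean1 * mean2 * (n1 + n2) ** 2)


def eta_conf(k):
    """Expected nestedness of the unconstrained configuration ensemble."""
    n = len(k)
    mean_k = sum(k) / n
    mean_k2 = sum(x * x for x in k) / n
    return mean_k2 / (mean_k ** 2 * n)


# Hypothetical degree sequences; each set must account for every edge once,
# so the two sums of degrees have to coincide (here both equal 6).
set_1 = [1, 2, 3]
set_2 = [2, 2, 2]
bip = eta_bipartite(set_1, set_2)
conf = eta_conf(set_1 + set_2)
print(bip, conf)
```

With sets of different sizes the two values no longer coincide in general, so the bipartite constraint then has to be taken into account in the null model.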



C.6 Overlapping networks 



If the adjacency matrix $\hat{a}$ describes a mutualistic network, the benefit to its
being nested resides in a counteraction of the competition matrix $\hat{c}$, which
Network                $r$      $\tilde{\eta}$   $\langle k\rangle$   $N$     $\sigma/\langle k\rangle$   Ref.
Political blogs       -0.221   1.496   22.4    1490   1.62   (Adamic and Glance)
Metabolic             -0.220   1.688   9        453   1.87   (Duch and Arenas, 2005)
Political books       -0.138   0.996   8.4      104   0.65   (Krebs)
Adjectives and nouns  -0.125   1.057   7.6      111   0.89   (Newman, 2006)
Dolphins              -0.063   0.922   5.1       61   0.58   (Lusseau et al., 2003)
Power grid             0.003   0.834   2.7     4940   0.67   (Watts and Strogatz, 1998)
Neural                 0.005   0.907   5.9      306   0.81   (Watts and Strogatz, 1998)
Jazz musicians         0.020   0.924   27.6     198   0.63   (Gleiser and Danon, 2003)
Email                  0.078   0.923   9.6     1133   0.97   (Guimerà et al., 2003)
American football      0.133   0.904   10.6     114   0.08   (Girvan and Newman, 2002)
PGP                    0.239   0.867   4.6    10680   1.77   (Boguñá et al., 2004)
High-energy arXiv      0.294   0.533   3.8     8360   1.14   (Newman, 2001)
Net-science arXiv      0.462   0.443   3.45    1588   1.00   (Newman, 2006)

Table C.2: Empirical networks appearing in Fig. C.3 (listed from least to most
assortative): $r$ is the assortativity and $\tilde{\eta}$ the nestedness. All data available on
the personal Web pages of Alex Arenas, Mark Newman and Duncan Watts.



takes into account the extent to which one species is detrimental to another
due to predation, sharing of resources, etc. From this point of view, it may be
interesting to study to what extent matrices $\hat{c}$ and $\hat{a}^2$ overlap (note that both
networks have the same nodes, but different edges). Presumably, if ecological
networks are assembled in such a way that effective competition is minimised,
this overlap should be higher than randomly expected. On the other hand, a
certain degree of overlap may also arise from the fact that species interacting
symbiotically with the same host are perhaps more likely than average to
be phylogenetically close and/or phenotypically similar, leading (as Darwin
noted) to a higher competition element.
In any case, a measure of this overlap is



I] 



(C.15) 



where $\langle\cdot\rangle_c$ represents an average over the competition network; similarly, $\langle\cdot\rangle_a$
will stand for an average over the mutualistic network. If the two networks
are mutually uncorrelated⁴ - i.e., if the existence of an edge in one provides no
information as to whether there is a corresponding one in the other - we can
write

Using $\sum_{ij}(\hat{a}^2)_{ij} = \langle k^2\rangle_a N$, and assuming that $\hat{c}$ is normalised so that $\sum_{ij}\hat{c}_{ij} =
\langle k\rangle_c N$, we have⁵

rune ^ (C.17) 

which only depends on the heterogeneity of the degree distribution of the 
mutualistic network. Again, it may be useful to consider the overlap normalised 
to this value, 

r 1 



' unc 

This measure will equal unity when there is no statistical relation between 
the competition matrix and the mutualistic one, but can be expected to be 
greater if indeed such an overlap were contributing to a reduction in effective 
competition. 

It has recently been shown that interconnected networks are prone to
dangerous "cascades of failures" (Buldyrev et al., 2010). It seems that the northern
half of Italy was once left temporarily with no electric supply due to failures in
the power-grid closing down dependent internet servers, which in turn further
disrupted the grid, until many nodes of both networks were rendered
dysfunctional. If two inter-dependent networks were to coincide perfectly (r = 1), the
resilience of the system to node removal would be the same as that of just one
network; however, lower overlap leads to increased vulnerability to such
cascades of failures. Since the extinction of a species can result in its host species
also going extinct, such cascades of failures may be a threat to mutualistic
systems. In such cases it would seem that a high overlap r, as defined here,
between the competition matrix and the mutualistic one would minimise this
possibility. It would be interesting to test this experimentally.

⁴ Note that we are saying nothing of the internal correlations that each network may
display.

⁵ The competition matrix will in general be weighted, as could be the mutualistic one; we
shall treat both as though they were not, but using weighted networks would only influence
results by a normalisation factor.
C.7 Discussion

Whether or not the topological feature here described should be considered a 
measure of nestedness as it is usually understood in ecology is not clear. What 
is certain is that interactions between dynamical elements that are mediated 
by third parties, or common neighbours, can be relevant in a wide variety of 
settings. We have mentioned the paradigmatic case of ecosystems as well
as financial and communications networks. But other examples spring easily 
to mind. For instance, two excitatory neighbouring neurons might have their 
mutual effect dampened if they share inhibitory neighbours. Genetic networks 
are riddled with motifs such that switches activate or inactivate each other 
indirectly, via common neighbours. As we have shown, there are nontrivial 
relationships between nestedness, as it is here defined, and other topological 
features. If it turns out that this network property is indeed relevant for many 
complex systems, then we hope the null models we have laid out and analysed 
will prove useful in assessing its functional significance. 



Appendix D 



Publications derived from the 
thesis 



D.1 Journals and book chapters (the most relevant ones marked with an asterisk)

1. * Cluster Reverberation: A mechanism for robust short-term memory
without synaptic learning, S. Johnson, J. Marro, and J.J. Torres,
submitted, arXiv:1007.3122

2. * Enhancing neural-network performance via assortativity, S. de Franciscis,
S. Johnson, and J.J. Torres, Physical Review E 83, 036114 (2011)

3. Why are so many networks disassortative? S. Johnson, J.J. Torres, J.
Marro, and M.A. Muñoz, AIP Conf. Proc. 1332, 249-50 (2011)

4. Shannon entropy and degree-degree correlations in complex networks, S.
Johnson, J.J. Torres, J. Marro, and M.A. Muñoz, "Nonlinear Systems
and Wavelet Analysis", Ed. R. Lopez-Ruiz, WSEAS Press, pp. 31-35
(2010)

5. * Entropic origin of disassortativity in complex networks, S. Johnson,
J.J. Torres, J. Marro, and M.A. Muñoz, Physical Review Letters 104,
108702 (2010)

6. * Evolving networks and the development of neural systems, S. Johnson,
J. Marro, and J.J. Torres, Journal of Statistical Mechanics (2010) P03003

7. Excitable networks: Nonequilibrium criticality and optimum topology,
J.J. Torres, S. de Franciscis, S. Johnson, and J. Marro, International
Journal of Bifurcation and Chaos 20, 869-875 (2010)

8. Nonequilibrium behavior in neural networks: criticality and optimal
performance, J.J. Torres, S. Johnson, J.F. Mejias, S. de Franciscis, and J.
Marro, "Advances in Cognitive Neurodynamics (II)", Eds. R. Wang and
F. Gu, pp. 597-603, Springer, 2011, ISBN: 978-90-481-9694-4,
Proceedings of Second International Conference on Cognitive Neurodynamics
(ICCN2009), Hangzhou 15-19 November 2009.

9. Development of neural network structure with biological mechanisms, S.
Johnson, J. Marro, J.F. Mejias, and J.J. Torres, Lecture Notes in
Computer Science 5517, 228-235 (2009)

10. Switching dynamics of neural systems in the presence of multiplicative
colored noise, J.F. Mejias, J.J. Torres, S. Johnson, and H.J. Kappen,
Lecture Notes in Computer Science 5517, 17-23 (2009)

11. * Nonlinear preferential rewiring in fixed-size networks as a diffusion
process, S. Johnson, J.J. Torres, and J. Marro, Physical Review E 79,
050104(R) (2009)

12. * Functional optimization in complex excitable networks, S. Johnson, J.J.
Torres, and J. Marro, EPL 83, 46006 (2008)

13. Excitable networks: Non-equilibrium criticality and optimum topology,
J.J. Torres, S. de Franciscis, S. Johnson, and J. Marro, "Modelling and
Computation on Complex Networks and Related Topics", Eds. Criado,
González-Viñas, Mancini and Romance. Proceedings of the conference
"Net-Works 2008", 185-192, ISBN: 978-84-691-3819-9.

14. Topology-induced instabilities in neural nets with activity-dependent synapses,
S. Johnson, J. Marro, and J.J. Torres, "New Trends and Tools in
Complex Networks", Eds. Criado, Pello and Romance. Proceedings of the
conference "Net-Works 2007", 59-71, ISBN: 978-84-690-6890-8.

D.2 Abstracts 

1. Network topology and dynamical task performance, S. Johnson, J. Marro,
and J.J. Torres, AIP Conf. Proc. 1091, 280 (2009)

2. Constructive chaos in excitable networks with tuneable topologies, S.
Johnson, J. Marro, and J.J. Torres, XV Congreso de Física Estadística FisEs08,
104 (2008)

3. The effect of topology on neural networks with unstable memories, S.
Johnson, J. Marro, and J.J. Torres, AIP Conf. Proc. 887, 261 (2006)

4. Relationship between the solar wind and the upper-frequency limit of
Saturn Kilometric Radiation, M.Y. Boudjada, P.H.M. Galopeau, H.O.
Rucker, A. Lecacheux, W.S. Kurth, D.A. Gurnett, U. Taubenschuss,
J.T. Steinberg, S. Johnson, and W. Voller, European Geosciences Union
(2006)